The present disclosure relates generally to the field of enterprise messaging and, more specifically, to a system and method of implementing a distributed and reliable communication system for clients connected to a messaging provider that receive messages from a shared resource or a queue.
Production software environments typically have a low tolerance for failure. Consequently, they often require very short down times to ensure consistent service. Environments that provide very short down times are commonly known as “high availability” processing systems and often use a “hot standby” configuration, in which one processing node of the system is idle and becomes active only when, for any reason, the active processing node fails. Under this contingency approach, the standby node takes over control automatically.
In addition, as network functionality increases, it becomes increasingly more important for systems to allow applications and application components to be distributed across networks (e.g., on multiple application servers). For applications and application components to be effectively distributed, various distributed parts of applications and application components (i.e. nodes) need to be able to communicate with each other. Nodes may communicate with each other using messaging to exchange information. To facilitate the exchange of messages, developers will often use a message-oriented middleware (MOM) framework (or system of software components and conventions that provide a message-oriented middleware architecture and features; see generally Chappell, “Enterprise Message Bus”, O'Reilly (2004)) via MOM Providers. By using a MOM Provider, the information may be sent and received by nodes using only a predetermined message format and a destination address for the message. A node may be a software component or process that runs on a common computer or different computers connected by a network or networks. A node may be a message producer and/or a consumer. The predetermined message format may include a message header for message identification, a properties section for application-specific, provider-specific, and optional header fields, and a body section that contains the content of the message. The content of a message may include text, data packets, objects, or other information to be communicated between nodes.
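The three-part message format described above can be pictured with a minimal sketch. The class and field names (SketchMessage, messageId, destination, properties, body) are illustrative assumptions and are not taken from any particular MOM Provider's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical three-part message: header fields, a properties section, and a body. */
final class SketchMessage {
    final String messageId;               // header: identifies the message
    final String destination;             // header: destination address for the message
    final Map<String, Object> properties; // application-specific or provider-specific fields
    final String body;                    // content of the message (text in this sketch)

    SketchMessage(String messageId, String destination,
                  Map<String, Object> properties, String body) {
        this.messageId = messageId;
        this.destination = destination;
        this.properties = new LinkedHashMap<>(properties); // copy so later changes do not leak in
        this.body = body;
    }
}

class SketchMessageDemo {
    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("orderType", "BUY");
        SketchMessage m = new SketchMessage("ID-1", "queue/orders", props, "100 shares");
        System.out.println(m.messageId + " -> " + m.destination + " " + m.properties);
    }
}
```

A producer node needs only the format and the destination address to send such a message; it does not need to know which node will consume it.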
Several different types of messaging systems may be used for communicating between nodes including point-to-point and publish-and-subscribe. As seen in
In addition, as seen in
To facilitate sending and receiving messages, nodes typically use MOM providers built to handle the special requirements of messaging within an enterprise (e.g., IBM's WebSphere® MQ) to connect to a messaging agent for implementing message queues and/or topics. Enterprise messaging requires a reliable, flexible service for the asynchronous exchange of critical business data and events throughout an enterprise (or other message exchanging entity). One example of a messaging agent may be implemented according to the Java® Message Service (JMS). The JMS Application Programming Interface (API) operates in conjunction with an individual enterprise messaging provider's API (e.g., IBM's WebSphere® MQ offers such an API) to enable the development of portable, message-based applications in the Java® programming language. Under JMS, an Enterprise Messaging MOM Provider is also referred to as a JMS Provider. Messages may be sent and received asynchronously, and nodes sending and receiving messages do not typically need to know anything about the nodes they are communicating with. This allows more freedom between nodes and makes it easier to design interfaces between nodes and to distribute applications and application components across a network.
The addition of an Enterprise Messaging system allows for more robust system development. For example, the JMS API enhances the Java® 2 Enterprise Edition (or “J2EE”) platform by simplifying enterprise development, allowing loosely coupled, reliable, asynchronous interactions among J2EE components and legacy systems capable of messaging. Using JMS, in addition to J2EE, developers can easily add new features, in a robust manner, to a J2EE application with existing business events by adding new components (e.g., a message-driven bean, as defined within the Java® API) to operate on specific business events.
In a high availability messaging solution where multiple JMS clients attempt to receive JMS messages from a shared queue under transactional control, if more than one client is able to receive messages from the queue simultaneously, then in the event of a processing failure any uncommitted messages will again be made available to any client connected to the JMS provider. However, the sequence in which the messages appear may be changed.
Similarly, in a publish/subscribe messaging topology (hereinafter referred to as “pub-sub”), a slow subscriber can cause an imbalance in the messaging system. For example, in IBM's WebSphere® MQ PubSub, all messages for a particular subscriber are stored on a particular queue. If the subscriber “draining” that queue is on a slow machine or has limited resources (e.g., a single thread), it may not be able to drain the queue quickly enough to keep up with the publish rate from the broker, which can cause the queue depth to grow, producing increased latency and poor performance. In other, non-queue-based pub-sub environments, a slow subscriber can delay the whole system because a message is not considered delivered until it has been delivered to all subscribers. Therefore, a single slow subscriber can clog the whole system.
In addition, in current pub-sub implementations, there is no way to distribute that workload across multiple threads or machines. Also, there is currently no generalized solution that offers serialized access to a shared JMS queue between JMS clients receiving input from that queue and that is capable of preserving the integrity of the message sequence on that queue in the event of a processing failure.
Thus, it would be desirable to provide a method and apparatus for implementing a distributed and reliable communication system for JMS clients connected to a single JMS provider that receive input from a shared resource or a queue while preserving the integrity of the message sequence in the shared resource or within such queue in the event of a processing failure.
A method and system for controlling access to a messaging system that exchanges messages in a distributed data processing system that includes a plurality of computing devices are disclosed. A method in one aspect may comprise sending a control message from a first computing device to a storage medium on a second computing device. The storage medium includes a first queue and the control message includes a token that corresponds to a shared resource of the messaging system. The token identifies a registering computing device that has an interest-to-use the shared resource. The method may also include storing the control message on the first queue and browsing for the oldest control message in the first queue. The registering computing device associated with the oldest control message gains access to the shared resource.
A system for controlling access to a messaging system that exchanges messages in a distributed data processing system that includes a plurality of computing devices, in one aspect, may comprise a computing device having a storage medium. A plurality of client computing devices is operable to send a control message to the storage medium on the computing device. The storage medium includes a first queue and the control message includes a token that corresponds to a shared resource of the messaging system. The token identifies a registering computing device that has an interest-to-use the shared resource. The plurality of client computing devices is further operable to browse for the oldest control message in the first queue. The registering computing device associated with the oldest control message gains access to the shared resource.
A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform methods described herein, may also be provided.
Further features as well as the structure and operation of various embodiments are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.
a illustrates an example of a point-to-point messaging system.
b illustrates an example of a publish-and-subscribe messaging system.
According to one embodiment of the present invention, a token is defined and associated with a shared resource—e.g., a JMS queue, or a topic in a pub-sub environment. All clients that wish to have access to this shared resource register their interest, and only one of them has sole access to the shared resource at any particular time, while all the other clients wait in a “hot standby” mode. Once a client has the controlling interest, it retains ownership of the shared resource until it either explicitly relinquishes control or terminates. In the event of a processing failure, any uncommitted messages will be recovered to the shared resource, and one of the standby clients will gain ownership of the shared resource and resume message processing.
To simplify the discussion of the present invention below, one embodiment of the present invention that utilizes the Java® programming language, and particularly the JMS API extension to the J2EE, is discussed in detail. The discussion of the JMS API, or J2EE, below, however, is not intended to be read as a limitation on the present invention, and those skilled in the art would readily understand how other embodiments could be built using other data structures.
In addition, one embodiment of the present invention utilizes underlying features of a MOM implementation, as provided, for example, through a JMS Provider (such as IBM's Websphere® MQ) to preserve the messages on queue. By doing so, one embodiment of the present invention reduces the necessary overhead, thereby creating a lightweight system, by removing redundant functionality that would otherwise be provided by the MOM implementation. For example, by marking the messages on queue as “persistent” by the message originator, additional overhead associated with message maintenance is avoided because the MOM implementation already has facilities to maintain a message when the message is marked as “persistent”. In addition, messages on the queue are under the Syncpoint control (another feature of a MOM implementation, such as IBM's Websphere® MQ) of the Shared Queue Controller (hereinafter referred to as a “SQC”), which guarantees that a message will not be removed from the queue until the SQC has explicitly issued a commit call to the MOM. Thus, in the event of a failure, the “get” of the message will not be committed to any client, which will therefore assure that messages are preserved in the original sequence.
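The effect of Syncpoint control on message order can be sketched with a small in-memory stand-in. This is not the MOM implementation's API; SyncpointQueue and its methods are hypothetical names, and the sketch assumes a single consumer, consistent with the single-owner arrangement described here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-memory stand-in for a queue that is read under syncpoint control. */
class SyncpointQueue {
    private final Deque<String> queue = new ArrayDeque<>();
    private final List<String> inFlight = new ArrayList<>(); // got, but not yet committed

    void put(String message) { queue.addLast(message); }

    /** Returns the next message, or null if the queue is empty; removal is provisional. */
    String get() {
        String m = queue.pollFirst();
        if (m != null) inFlight.add(m);
        return m;
    }

    /** Makes all provisional removals permanent. */
    void commit() { inFlight.clear(); }

    /** Models a failure before commit: uncommitted messages return in their original order. */
    void rollback() {
        for (int i = inFlight.size() - 1; i >= 0; i--) {
            queue.addFirst(inFlight.get(i));
        }
        inFlight.clear();
    }

    /** Copy of the messages currently on the queue, oldest first. */
    List<String> snapshot() { return new ArrayList<>(queue); }
}
```

Because only one client gets messages at a time, a rollback puts the uncommitted messages back ahead of everything that was never read, so the next owner sees the original sequence.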
An alternative embodiment to the present invention provides a generic solution for distributed platforms using some components described in the JMS specification. This method will allow JMS messages to be consumed from a JMS queue in sequential order by multiple JMS clients connected to the same JMS provider. Thus, an alternative embodiment of the present invention ensures that the original messages' sequence is maintained at all times, including the event of a processing failure in one of the clients.
Further embodiments of the present invention provide a generic, platform-independent construct that performs the functions of the SQC as described above in any MOM implementation. Such an SQC manages registrations from JMS clients that have interests in a shared JMS Queue or a pub-sub provider.
Thus, in accordance with one aspect of the invention, there is provided a high availability communication method for sequential processing of JMS messages from a shared resource in a distributed environment by controlling access of multiple JMS clients, each using a Shared Queue Controller, connecting to the same JMS provider.
Each of the SQCs communicates with a Control Queue (hereinafter referred to as a “CQ”), e.g., CQ 240a. The CQ controls access to a shared resource; hence, CQ 240a controls access to SR 240b illustrated in
The operation of the CQ is described in detail below, but the CQ generally operates as a queue of Control Messages (hereinafter referred to as “CMs”), where the SQC that sent the oldest CM in the queue is granted access to the shared resource. Each CM, however, has a limited lifetime, and when its lifetime expires, the CM is removed from the CQ. When a CM is removed from the CQ, the SQC (and, similarly, the client) that sent the CM loses its place in the queue, unless that SQC was the first to send a CM and its client is therefore accessing the shared resource. Otherwise, when the expired CM is not the first CM in the queue, the SQC (and its corresponding client) would no longer be given access to the shared resource if no other actions were performed by the SQC. Consequently, according to one embodiment of the present invention, each SQC re-transmits a CM to the CQ prior to the expiration of the CM to maintain the SQC's (and the client's) interest in the shared resource.
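The interplay of CM lifetime and re-transmission can be illustrated with a minimal sketch. It uses an explicit clock value instead of real time, and it assumes that a re-transmitted CM carries the time of the original registration; the names ControlMsg, ExpiringControlQueue, registeredAt and expiresAt are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical control message (CM): who registered, when, and when this copy expires. */
final class ControlMsg {
    final String clientId;
    final long registeredAt; // time of first registration; assumed to be kept across re-sends
    final long expiresAt;    // time at which the provider discards this copy

    ControlMsg(String clientId, long registeredAt, long expiresAt) {
        this.clientId = clientId;
        this.registeredAt = registeredAt;
        this.expiresAt = expiresAt;
    }
}

/** Hypothetical in-memory control queue (CQ) driven by an explicit clock value. */
class ExpiringControlQueue {
    private final List<ControlMsg> messages = new ArrayList<>();

    void send(ControlMsg cm) { messages.add(cm); }

    /** Discards expired CMs, then returns the client holding the oldest registration. */
    Optional<String> owner(long now) {
        messages.removeIf(cm -> cm.expiresAt <= now);
        return messages.stream()
                .min(Comparator.comparingLong((ControlMsg cm) -> cm.registeredAt))
                .map(cm -> cm.clientId);
    }
}
```

In this sketch a client that stops re-sending simply ages out of the queue, and the client with the next oldest registration becomes the owner without any explicit hand-over message.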
Although the contents of CQ 240a are not illustrated in
According to one embodiment of the present invention, each JMS client in a JMS Provider domain (e.g., CL1 215, CL2 225 and CL3 235) that wants to consume messages from the shared resource (e.g., SR 240b) will create a new SQC (e.g., SQC 210, SQC 220 and SQC 230) and register an interest in the shared resource using the token identifier (discussed in further detail below) associated with the shared resource (e.g., SR 240b). For example, each JMS Client may indicate an interest in a shared resource by attempting to receive a message from the shared resource through the getNextMessage( ) method call (with the shared resource name as one of the arguments), accessed via the SQC. As described in further detail below, the getNextMessage( ) method registers an interest in the shared resource when the JMS Client does not own the shared resource and requests transmission of the next JMS message (a component of the Java® JMS API) when the JMS Client is the owner of the shared resource. In response to the getNextMessage( ) method invocation when the JMS Client is not the owner of the shared resource, the SQC creates a CM to be transmitted to the CQ corresponding to the shared resource to register the JMS Client's interest in the shared resource. According to one embodiment of the present invention, each JMS Client with an interest in the shared resource will register an interest in the shared resource via CM objects sent to the CQ. Consequently, only one JMS client at a time will be allowed to consume messages from the SR (e.g., SR 240b), and this is the client that owns the oldest registered CM (e.g., CM 210a) on the CQ (e.g., CQ 240a).
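The two behaviours of getNextMessage( ) described above can be sketched as follows. This is a simplified, single-resource illustration and not the class of Table 1: the signature shown (no arguments, null when nothing is delivered) is an assumption, CM expiry is omitted, and SharedState and SketchController are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical shared state: one control queue and one shared resource. */
class SharedState {
    final Deque<String> controlQueue = new ArrayDeque<>();   // client ids, oldest first
    final Deque<String> sharedResource = new ArrayDeque<>(); // application messages
}

/** Hypothetical controller; the actual getNextMessage( ) signature is not given in the text. */
class SketchController {
    private final String clientId;
    private final SharedState state;

    SketchController(String clientId, SharedState state) {
        this.clientId = clientId;
        this.state = state;
    }

    /**
     * Registers an interest if none exists yet. Returns the next message only when this
     * client holds the oldest registration; otherwise (or if no message is waiting) null.
     */
    String getNextMessage() {
        if (!state.controlQueue.contains(clientId)) {
            state.controlQueue.addLast(clientId); // stands in for sending a CM to the CQ
        }
        boolean owner = clientId.equals(state.controlQueue.peekFirst());
        return owner ? state.sharedResource.pollFirst() : null;
    }

    /** Withdraws this client's interest, giving up its place in the control queue. */
    void deregister() { state.controlQueue.remove(clientId); }
}
```

A standby client can call getNextMessage( ) repeatedly; it receives nothing until the registrations ahead of it have been withdrawn.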
Furthermore, the shared resource offers high availability to its accessing clients in the embodiment of the present invention illustrated in
As illustrated,
Additionally, as shown in
In one aspect, shown in
Each SharedQueueController 320 may be implemented as a Java® class, according to one embodiment of the present invention. For example, the Java® class shown in Table 1 could implement SharedQueueController 320.
As mentioned previously, each JMS Client 310 registering an interest in a shared resource within a JMS Provider negotiates the terms of that registration via arguments passed to the ControlMessage 330 during its creation by the SharedQueueController 320. Furthermore, each JMS client 310 suggests a transaction model (e.g., whether the JMS Messages are to be preserved until an acknowledgement is received by the JMS Client) and an expiration interval, but ultimately yields to the settings of the current owning (e.g., oldest registered) JMS Client. By yielding to the configuration settings of the currently owning JMS Client, a registering JMS Client operates as a “hot standby” of the current owner of the shared resource. If a SharedQueueController 320 registering with a Shared Resource 350 does not agree to the terms used by the current JMS Client (e.g., transaction model and expiration interval), the registering SharedQueueController 320 can choose to deregister its interest (thereby losing its place in ControlQueue 340), or wait to obtain ownership (and establish another set of terms to interact with Shared Resource 350).
As shown in
Additionally, shown in
As mentioned above, ControlMessage 330 illustrated in
If a client terminates abnormally then the Control Messages from that client will no longer be replaced on the Control Queue and that client's interest in the shared JMS Queue will lapse. A JMS client can explicitly deregister its interest in a shared JMS queue via its Shared Queue Controller.
The JMS client that owns the oldest unexpired registration message (i.e., the ControlMessage 330 with the oldest timestamp) on the Control Queue for a particular token identifier is considered to be the current owner of the shared JMS queue associated with that token. If the client is not the current owner, it can wait for its chance to gain ownership by allowing its Shared Queue Controller to maintain the interest by continuing to send Control Messages to the Control Queue. The assignment of ownership is a collaborative act among the SharedQueueControllers in the system. Each Shared Queue Controller asserts that it will not assume ownership of a shared resource if it does not own the control message with the oldest timestamp.
As mentioned above, the owner of the SharedResource 350 is the JMS Client 310 that has sole access to receive messages from the SharedResource 350 by virtue of its SharedQueueController 320 discovering that its registration (represented by ControlMessage 330) is the oldest in the ControlQueue 340 corresponding to the SharedResource 350. Thus, when the current JMS Client 310 owner of SharedResource 350 terminates or relinquishes its controlling interest, the JMS Client 310 with the next oldest registered interest (as represented by ControlMessage 330) will become the owner of SharedResource 350. A client's Shared Queue Controller detects when its client has acquired ownership of the shared queue, and will receive messages from that queue on behalf of the client.
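The ownership rule described above, in which each controller independently checks whether it holds the oldest registration for a token, can be sketched as a pure function over a snapshot of the control queue. Registration and OwnershipCheck are hypothetical names, and the snapshot is assumed to contain only unexpired registrations.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical snapshot entry of the control queue: token, registering client, timestamp. */
final class Registration {
    final String tokenName;
    final String clientId;
    final long timestamp;

    Registration(String tokenName, String clientId, long timestamp) {
        this.tokenName = tokenName;
        this.clientId = clientId;
        this.timestamp = timestamp;
    }
}

class OwnershipCheck {
    /** True if clientId holds the oldest registration for tokenName in the snapshot. */
    static boolean isOwner(List<Registration> controlQueue, String tokenName, String clientId) {
        return controlQueue.stream()
                .filter(r -> r.tokenName.equals(tokenName))
                .min(Comparator.comparingLong((Registration r) -> r.timestamp))
                .map(r -> r.clientId.equals(clientId))
                .orElse(false);
    }
}
```

Because every controller applies the same rule to the same queue contents, at most one of them concludes that it is the owner of a given token, and removing the owner's registration makes the next oldest registrant the owner.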
Use of ControlMessage 330 objects, to refresh interest and preserve eventual access to SharedResource 350 (via ownership), can be controlled between multiple JMS Client 310 objects connecting to the same JMS provider in a distributed environment, as illustrated in
Continuing with the class collaboration diagram illustrated in
As discussed above, SharedQueueController 320 preserves the interest of JMS Client 310 in SharedResource 350 by sending ControlMessage 330 objects to ControlQueue 340. Since each ControlMessage 330 inherits JMS Message class properties, each ControlMessage 330 sets its JMS Expiration property to the TimeOut property (or time-to-live) value obtained from the oldest registered ControlMessage 330 on the ControlQueue 340 for the same registration TokenName. Based on the JMS Expiration property, as described above, each ControlMessage 330 will be automatically removed from the ControlQueue 340 by the JMS provider when its expiration time is exceeded. Thus, ControlQueue 340 is automatically cleared of ControlMessage 330 objects that are no longer relevant. In one embodiment, all SharedQueueController 320 objects registered for the same TokenName use the same TimeOut interval.
Thus, according to one embodiment of the present invention, SharedQueueController 320 for a JMS Client 310 will keep adding ControlMessage 330 objects to the ControlQueue 340 to maintain an interest in a TokenName (and its associated SharedResource 350). SharedQueueController 320 subsequently determines (e.g., via ControlQueueBrowser 320b) if its JMS Client 310 is the owner of the oldest ControlMessage 330 on the ControlQueue 340 for the registered TokenName. Upon such a determination, JMS Client 310 is then permitted to receive messages from the Shared Resource 350 (as indicated by CL1 215a in
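A minimal sketch of how a registering controller might choose its TimeOut and derive an expiration time is given below. It relies on the JMS convention that a message's expiration time is its send time plus its time-to-live, in milliseconds; the names TimedControlMsg and ExpirationSketch are hypothetical, as is the fallback to the client's own suggestion when no CM exists yet for the token.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical control message carrying its token, registration time and TimeOut. */
final class TimedControlMsg {
    final String tokenName;
    final long registeredAt;
    final long timeOutMillis;

    TimedControlMsg(String tokenName, long registeredAt, long timeOutMillis) {
        this.tokenName = tokenName;
        this.registeredAt = registeredAt;
        this.timeOutMillis = timeOutMillis;
    }
}

class ExpirationSketch {
    /** TimeOut to use: that of the oldest CM for the token, else the client's own suggestion. */
    static long effectiveTimeOut(List<TimedControlMsg> controlQueue, String tokenName,
                                 long suggestedMillis) {
        return controlQueue.stream()
                .filter(cm -> cm.tokenName.equals(tokenName))
                .min(Comparator.comparingLong((TimedControlMsg cm) -> cm.registeredAt))
                .map(cm -> cm.timeOutMillis)
                .orElse(suggestedMillis);
    }

    /** Expiration time of a CM sent at sendTimeMillis with the given time-to-live. */
    static long expirationTime(long sendTimeMillis, long timeOutMillis) {
        return sendTimeMillis + timeOutMillis;
    }
}
```

Adopting the oldest CM's TimeOut keeps every registrant for a token on the same refresh cadence, which is what allows a standby to take over on the owner's terms.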
In
Although the embodiments discussed above illustrated the present invention in a typical MOM environment, one skilled in the art could, without undue experimentation, expand the descriptions above to other environments. For example, in
Moreover, as described above, each SQC creates a CMB (not shown in
In
In addition, the JMS Clients on Brokers 520 and 530 are waiting to obtain access to the SR 540b (as illustrated by reference characters 525 and 535). If, during the processing of one message, the message flow on Message Broker 510 experiences an unhandled error (or exception) and the flow stops unexpectedly, the SQC for the JMS Client on Message Broker 510 will also terminate. Consequently, since the SQC for Message Broker 510 terminated, CM objects from Message Broker 510 will no longer be sent to the CQ 540a (as described above).
Once the CM objects from Message Broker 510 expire from the CQ 540a, in the example illustrated in
When the message flow on Message Broker 510 resumes, however, the JMS Client may again create an SQC and register an interest in the SR 540b (as described above). Since Message Broker 510 lost its place in CQ 540a, it will have the youngest registration timestamp in CQ 540a. Therefore, the example shown in
Additionally, each SQC creates a CMB (not shown in
In
The JMS Client in execution group ex1 622 continues to receive JMS Messages from SR 610b until it finds the last message in the JMS Message group. As described above, ex1 622 calls the commit( ) method (e.g., as described in Table 2) on the SQC followed by the deregister( ) method (e.g., as described in Table 2). Furthermore, after the deregister( ) method is called, the SQC closes both the CMH and CMB. Consequently, the CM object(s) for execution group ex1 622 will expire from CQ 610a, indicating SR 610b is available to the next available client. For example, in
If the JMS Client on execution group ex1 622 again registers its interest in the SR 610b, the corresponding CM 622a will have the youngest registration timestamp on CQ 610a. Therefore, ex1 622 will wait for the message flows on execution group ex2 624 to stop and for ex2 624 to deregister its interest (as described above) before ex1 622 can resume receiving input.
Note that this particular embodiment does not require that the group messages be placed back on queue in their original sequence in the event of a failure, since the messages will be consumed or recovered as a single logical unit. In this case it is the completeness of the group message that is of concern and not where that group is placed on the queue.
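Consuming a message group as a single logical unit can be sketched as a loop that reads until the last message of the group is found. The names GroupMessage and GroupConsumer, the lastInGroup flag, and the choice to return an incomplete group to the front of the queue are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical grouped message; lastInGroup marks the final message of its group. */
final class GroupMessage {
    final String groupId;
    final String payload;
    final boolean lastInGroup;

    GroupMessage(String groupId, String payload, boolean lastInGroup) {
        this.groupId = groupId;
        this.payload = payload;
        this.lastInGroup = lastInGroup;
    }
}

class GroupConsumer {
    /**
     * Removes and returns messages up to and including the first one flagged lastInGroup.
     * If the queue runs out first, the partial group is put back and an empty list returned.
     */
    static List<GroupMessage> receiveGroup(Deque<GroupMessage> sharedResource) {
        List<GroupMessage> group = new ArrayList<>();
        while (!sharedResource.isEmpty()) {
            GroupMessage m = sharedResource.pollFirst();
            group.add(m);
            if (m.lastInGroup) {
                return group; // complete: the caller would now commit and deregister
            }
        }
        // Incomplete group: recover it as one unit (placement on the queue is not essential).
        for (int i = group.size() - 1; i >= 0; i--) {
            sharedResource.addFirst(group.get(i));
        }
        return new ArrayList<>();
    }
}
```

The caller either gets a whole group or nothing, which reflects the point that completeness of the group matters more than where the group sits on the queue.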
While the preceding paragraphs describe a control queue (e.g., the CQ) that regulates access to a shared resource, a software component that manages a client's interest in the shared resource (e.g., the SQC), and a registration mechanism (e.g., the generation and handling of CM objects) in a point-to-point message topology within a MOM framework, other configurations are possible.
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Any combination of one or more computer usable or computer readable medium(s) may be utilized. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CDROM), an optical storage device, a transmission media such as those supporting the Internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable medium could even be paper or another suitable medium, upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer-usable medium may include a propagated data signal with the computer-usable program code embodied therewith, either in baseband or as part of a carrier wave. The computer usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring now to
The computer program product may comprise all the respective features enabling the implementation of the methodology described herein, and which—when loaded in a computer system—is able to carry out the methods. Computer program, software program, program, or software, in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
The computer processing system that carries out the system and method of the present disclosure may also include a display device such as a monitor or display screen 704 for presenting output displays and providing a display through which the user may input data and interact with the processing system, for instance, in cooperation with input devices such as the keyboard 706 and mouse device 708 or pointing device. The computer processing system may also be connected or coupled to one or more peripheral devices such as the printer 710, scanner (not shown), speaker, and any other devices, directly or via remote connections. The computer processing system may be connected or coupled to one or more other processing systems such as a server 710, other remote computer processing system 714, network storage devices 712, via any one or more of a local Ethernet, WAN connection, Internet, etc. or via any other networking methodologies that connect different computing systems and allow them to communicate with one another. The various functionalities and modules of the systems and methods of the present disclosure may be implemented or carried out in a distributed manner on different processing systems (e.g., 702, 714, 718), or on any single platform, for instance, accessing data stored locally or in a distributed manner on the network.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements, if any, in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Various aspects of the present disclosure may be embodied as a program, software, or computer instructions embodied in a computer or machine usable or readable medium, which causes the computer or machine to perform the steps of the method when executed on the computer, processor, and/or machine. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform various functionalities and methods described in the present disclosure is also provided.
The system and method of the present disclosure may be implemented and run on a general-purpose computer or special-purpose computer system. The computer system may be any type of system, whether currently known or later developed, and may typically include a processor, memory device, a storage device, input/output devices, internal buses, and/or a communications interface for communicating with other computer systems in conjunction with communication hardware and software, etc.
The terms “computer system” and “computer network” as may be used in the present application may include a variety of combinations of fixed and/or portable computer hardware, software, peripherals, and storage devices. The computer system may include a plurality of individual components that are networked or otherwise linked to perform collaboratively, or may include one or more stand-alone components. The hardware and software components of the computer system of the present application may include and may be included within fixed and portable devices such as desktops, laptops, and servers. A module may be a component of a device, software, program, or system that implements some “functionality”, which can be embodied as software, hardware, firmware, electronic circuitry, etc.
The embodiments described above are illustrative examples and it should not be construed that the present invention is limited to these particular embodiments. Thus, various changes and modifications may be effected by one skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.