Cluster architecture having a star topology with centralized services

Information

  • Patent Grant
  • Patent Number
    8,190,780
  • Date Filed
    Tuesday, December 30, 2003
  • Date Issued
    Tuesday, May 29, 2012
Abstract
A cluster architecture having a star topology with a central services node at its center is described. Application server instances are organized in the star topology with the central services node at its center. The central services node may include services such as a messaging server for interinstance cluster communications. A locking server may also be included to provide cluster-wide locking to facilitate changes and updates within the cluster. A database may also be shared by all instances in the cluster, thereby reducing the need for data replication. In one embodiment, the message server has no persistent state. In such an embodiment, if the message server fails, it can merely be restarted without any state recovery requirement.
Description
BACKGROUND

1. Field of the Invention


The invention relates to cluster architecture. More specifically, the invention relates to a star topology cluster architecture for application servers.


2. Background


Traditionally, each instance in a cluster of application servers maintains a direct communication link with each other instance. As used herein, an instance refers to a unit in the cluster that can be started, stopped, and monitored separately from other units in the cluster. As the cluster becomes increasingly large, the overhead associated with maintaining a communication link from each instance to every other instance becomes quite burdensome. The resulting performance problems have created a real constraint on cluster size. Additionally, to maintain the homogeneity of the cluster, large amounts of data are replicated and passed around the cluster over these myriad communication links. This overhead severely impacts the scalability of such cluster architectures.


SUMMARY

A cluster architecture having a star topology with a central services node at its center is described. Application server instances are organized in the star topology with the central services node at its center. The central services node may include services such as a messaging server for interinstance cluster communications. A locking server may also be included to provide cluster-wide locking to facilitate changes and updates of data within the cluster or to synchronize concurrent processes. A database may also be shared by all instances in the cluster, thereby reducing the need for data replication. In one embodiment, the message server has no persistent state. In such an embodiment, if the message server fails, it can merely be restarted without any state recovery requirement.





BRIEF DESCRIPTION OF THE DRAWINGS

The invention is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean at least one.



FIG. 1 is a block diagram of a system employing a cluster architecture of one embodiment of the invention.



FIG. 2 is a fundamental modeling concept (FMC) diagram of dispatcher/server process interaction.



FIG. 3 is a flow diagram of a cluster establishment in one embodiment of the invention.



FIGS. 4 and 5 illustrate one exemplary startup method and framework, respectively.





DETAILED DESCRIPTION


FIG. 1 is a block diagram of a system employing a cluster architecture of one embodiment of the invention. A cluster includes a central services node 102 organized in a star topology with one or more application server instances 104 (104-1 to 104-M), with the central services node 102 at the center of the star topology. Application server instances 104 receive requests from one or more clients, for example, via web browsers 108 (108-1 to 108-N) over a network 106. In one embodiment, the network 106 may be a distributed network such as the Internet. In one embodiment, requests from the client web browser 108 may be transmitted using hypertext transfer protocol (HTTP). In one embodiment, the cluster is a Java 2 Enterprise Edition (J2EE) cluster.


As previously noted, as used herein, each “instance” is a unit in the cluster, which can be started, stopped and monitored separately. In one embodiment, each instance runs on a physical server, but more than one instance may run on a single physical server. In one embodiment, an instance may be identified within the cluster by a system identification number and an instance number. In one embodiment, the central services node is an example of a J2EE instance.


An instance typically contains at least one server process 124. More commonly, an instance includes a dispatcher 122 and several server processes 124. The dispatcher 122 and the server process 124 are described in greater detail below in connection with FIG. 2. It is also contemplated that more than one dispatcher may reside in a single instance.


In one embodiment, central services node 102 includes a message server 110, lock server 112 and a shared database 114. The message server 110 maintains a client list 116. The client list 116 includes a listing of the dispatchers 122 and server processes 124 of the cluster. Client list 116 should not be confused with a list of client web browsers 108, but rather, is a list of the clients of the message server. The message server also maintains a service list 118, which lists the services available within the cluster such as, for example, an HTTP service.


Message server 110 is responsible for interinstance communication. For example, if a server process 124 of instance 104-1 wishes to send a message 126 to instance 104-M, the message 126 is sent via the message server 110, which provides the infrastructure for interinstance communication. In one embodiment, each server process 124 and each dispatcher 122 has a link through which it can communicate with the message server, and through which the message server may send messages notifying of events within the cluster. Message server 110 also supplies a dispatcher 122 with information to facilitate load balancing between various instances in the cluster. As noted above, the message server 110 also provides notification of events that arise within the cluster, for example, failure or shutdown of an instance or when a service is started or stopped. Because the message server may represent a single point of failure in the cluster, it should support failover to be effectively used in high availability systems. To that end, in one embodiment, the message server has no persistent state, such that if the message server fails, it need merely be restarted, after which the instances in the cluster re-register, without any state recovery procedures being performed.
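For purposes of illustration only, the following minimal Java sketch shows one way a stateless message server could track its registered clients (dispatchers and server processes), the services they provide, and route interinstance messages. All class, method, and field names are hypothetical and are not drawn from the patent; because the tables are held purely in memory, a restart simply clears them until the instances re-register.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

// Minimal in-memory message server sketch: tracks registered cluster nodes
// (dispatchers and server processes) and the services they provide, and
// routes messages between instances. It holds no persistent state.
public class MessageServer {

    // "Client list": nodes registered with the message server, keyed by node id.
    private final Map<String, NodeHandle> clients = new ConcurrentHashMap<>();
    // "Service list": service name -> ids of the nodes providing it.
    private final Map<String, Set<String>> services = new ConcurrentHashMap<>();

    public interface NodeHandle {
        void deliver(String fromNodeId, String payload);
    }

    // A dispatcher or server process registers itself and its services.
    public void register(String nodeId, List<String> providedServices, NodeHandle handle) {
        clients.put(nodeId, handle);
        for (String service : providedServices) {
            services.computeIfAbsent(service, s -> new CopyOnWriteArraySet<>()).add(nodeId);
        }
        broadcastEvent("NODE_JOINED " + nodeId);
    }

    // Route a point-to-point message between two instances through the center of the star.
    public void send(String fromNodeId, String toNodeId, String payload) {
        NodeHandle target = clients.get(toNodeId);
        if (target != null) {
            target.deliver(fromNodeId, payload);
        }
    }

    // Notify every registered node of a cluster event (join, shutdown, service started, ...).
    public void broadcastEvent(String event) {
        clients.forEach((id, handle) -> handle.deliver("message-server", event));
    }
}
```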


Lock server 112 is used for internal synchronization of the cluster. Applications may lock objects and subsequently release them. This is particularly true in the context of updating the cluster, for example, with the deployment of a new application. Lock server 112 maintains a lock table 120 in which it manages logical database locks to prevent inconsistent modifications of, for example, database 114. In one embodiment, the lock table is maintained in main memory (not shown). In one embodiment, lock table 120 maps logical locks to data that is contained in database 114. The lock server 112 may also represent a single point of failure within the cluster. For high availability systems, a replication server may be provided which maintains a real-time replica of the state of the lock server 112. After a lock server failure, the lock server 112 may be restarted with the replicated state from the replication server.
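As a simple illustration of the lock server concept, the sketch below keeps a lock table in main memory that maps logical lock names to the node that owns them. The names and API are hypothetical; an actual implementation would add lock modes, timeouts, and shipping of the state to a replication server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal lock server sketch: an in-memory lock table mapping logical lock
// names (e.g. a key for data in the shared database) to the owning node.
public class LockServer {

    private final Map<String, String> lockTable = new ConcurrentHashMap<>();

    // Attempt to acquire a logical lock on behalf of a node; returns true if granted.
    public boolean tryLock(String lockName, String ownerNodeId) {
        return lockTable.putIfAbsent(lockName, ownerNodeId) == null
                || ownerNodeId.equals(lockTable.get(lockName));
    }

    // Release the lock only if the caller actually owns it.
    public boolean unlock(String lockName, String ownerNodeId) {
        return lockTable.remove(lockName, ownerNodeId);
    }

    // Snapshot of the lock table, e.g. for shipping to a replication server.
    public Map<String, String> snapshot() {
        return Map.copyOf(lockTable);
    }
}
```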



FIG. 2 is a fundamental modeling concept (FMC) diagram of dispatcher/server process interaction. Initially, the client 208 requests a connection from the connection request handler 202 using, for example, HTTP. A load balancer 204 provides an indication of which server process 124 will handle the client request received by the connection request handler 202. In one embodiment, load balancer 204 uses a weighted round robin procedure. The connection request handler 202 initializes a connection object, which is then assigned to the client 208, after which the client communicates with the connection manager 206 going forward. Clients 108-1 and 108-M are shown with connections to connection manager 206 already in place. Once the connection request handler 202 creates the connection object and establishes the connection, connection manager 206 identifies which session level services 210, such as HTTP or RMI (remote method invocation) for example, will be used. All requests from the same client are sent to the same server process 124 thereafter. The connection manager 206 then forwards the requests, including the session level services, to the communication handler 212 where they may be queued in request queue 214 until thread manager 213 provides a worker thread 215 to service the requests to be sent to the server process 124.
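The weighted round robin procedure mentioned above can be sketched as follows. The server identifiers and weights are illustrative only, and production load balancers typically use a smoother interleaving than this simple expanded ring.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative weighted round-robin selector, as a dispatcher's load balancer
// might use to choose the server process that handles a new client connection.
public class WeightedRoundRobin {

    private final List<String> ring = new ArrayList<>();
    private int position = 0;

    // A server process with weight w appears w times in the selection ring.
    public void addServer(String serverId, int weight) {
        for (int i = 0; i < weight; i++) {
            ring.add(serverId);
        }
    }

    // Returns the server process that should handle the next connection.
    public synchronized String next() {
        String server = ring.get(position);
        position = (position + 1) % ring.size();
        return server;
    }

    public static void main(String[] args) {
        WeightedRoundRobin lb = new WeightedRoundRobin();
        lb.addServer("server-0", 3);  // heavier weight -> more connections
        lb.addServer("server-1", 1);
        for (int i = 0; i < 8; i++) {
            System.out.println(lb.next());
        }
    }
}
```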


Requests to the server process 124, in one embodiment, may be sent using transmission control protocol/internet protocol (TCP/IP). Communications handler 218 in the server process receives the request from the dispatcher 122 and queues it in the request queue 226 until thread manager 222 provides a worker thread 224 to service the request. Session level services 210, which may be the same as those assigned in the dispatcher node 122, are applied, and the thread attempts to service the request using the application/application level service 220. Because each server process is multi-threaded, a large number of requests may be serviced simultaneously. The response to the request is sent back through the dispatcher 122 over the connection established within the connection manager 206 to the client 108.
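The request-queue/worker-thread pattern used in both the dispatcher and the server process can be sketched as below. The class and method names are hypothetical stand-ins for the thread managers and request queues described above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the request-queue/worker-thread pattern: incoming requests are
// queued until a worker thread becomes free to service them, so many
// requests can be handled concurrently.
public class RequestProcessor {

    private final BlockingQueue<Runnable> requestQueue = new LinkedBlockingQueue<>();

    public RequestProcessor(int workerThreads) {
        for (int i = 0; i < workerThreads; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        // Blocks until a queued request is available, then services it.
                        requestQueue.take().run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shut this worker down
                }
            }, "worker-" + i);
            worker.setDaemon(true);
            worker.start();
        }
    }

    // Called by the communications handler when a request arrives.
    public void submit(Runnable request) {
        requestQueue.add(request);
    }
}
```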



FIG. 3 is a flow diagram of cluster establishment in one embodiment of the invention. At block 302, the central services node (CSN) is started. In one embodiment, the central services node is started on a physical server with its own system number and the system identification number of the whole system. Once the central services node is up and running, additional instances are started at block 304. One manner in which additional instances may be started is described with reference to FIGS. 4 and 5 below. The dispatcher and server processes of the various instances register with the central services node at block 306. In one embodiment, a message server in the central services node maintains a list of connected clients including a system ID and instance number for each dispatcher and server process running in the cluster. At block 308, the central services node notifies all current cluster registrants as each additional instance joins the cluster. This may include notifying the cluster constituents of services available within the cluster at startup, including any new services deployed as a result of additional instances joining the cluster or of additional application deployments.
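A minimal sketch of the registration record the message server might keep for each connected dispatcher and server process is shown below. The field and method names are hypothetical and merely illustrate the system ID/instance number identification described above.

```java
import java.util.List;

// Hypothetical registration record for one entry in the message server's
// list of connected clients (dispatchers and server processes).
public record ClusterNodeRegistration(
        String systemId,         // identifies the system (cluster) as a whole
        int instanceNumber,      // identifies the instance within the system
        String nodeType,         // e.g. "dispatcher" or "server"
        List<String> services) { // services this node makes available

    // Readable key combining system ID and instance number, e.g. "C11/00".
    public String nodeKey() {
        return systemId + "/" + String.format("%02d", instanceNumber);
    }
}
```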


A determination is made at decision block 309 whether a failure has occurred in the central services node. Failure may be the result of, for example, message server failure, lock server failure, or database failure. If a failure has occurred, a determination is made at decision block 310 whether the database has failed. If it has, the database is recovered according to procedures established by the database vendor at block 312. If the database has not failed, or after recovery, a determination is made at decision block 314 whether the central services node lock server has failed. If it has, the lock server is restarted using a replica from the replication server at block 316. A determination is then made at decision block 318 whether the message server has failed. If the message server has failed, the message server is restarted at block 320. Once the message server has restarted, registration of the instances recommences at block 306 as previously described. If the central services node has not failed, a failure continues to be awaited at decision block 309. While FIG. 3 depicts a flow diagram, in various embodiments certain operations may be conducted in parallel or in a different order than depicted. Accordingly, different ordering and parallelism are within the scope of the invention.
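Purely as an illustration of the decision flow just described, the sketch below checks the database, lock server, and message server in turn and recovers whichever has failed. The Component interface and recovery calls are hypothetical stand-ins, not the patent's implementation.

```java
// Illustrative recovery flow for the central services node: each component is
// checked and recovered in the order described for FIG. 3.
public class CentralServicesRecovery {

    interface Component {
        boolean hasFailed();
        void recover();
    }

    private final Component database;      // recovered via the database vendor's procedures
    private final Component lockServer;    // restarted from the replication server's replica
    private final Component messageServer; // simply restarted; it has no persistent state

    public CentralServicesRecovery(Component database, Component lockServer, Component messageServer) {
        this.database = database;
        this.lockServer = lockServer;
        this.messageServer = messageServer;
    }

    public void handleFailure() {
        if (database.hasFailed()) {
            database.recover();
        }
        if (lockServer.hasFailed()) {
            lockServer.recover();
        }
        if (messageServer.hasFailed()) {
            messageServer.recover();
            // After the restart, the instances in the cluster simply re-register;
            // no state recovery is needed.
        }
    }
}
```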


One embodiment of the invention employs a unique startup framework for starting and stopping the various server instances within the cluster. FIGS. 4 and 5 illustrate one exemplary startup method and framework, respectively. The startup framework 900 includes startup and control logic 902 and bootstrap logic 901. In one embodiment, the startup and control logic 902 provides the central point of control for the instance 104 and for all processes 903 executed within the servers and dispatchers of the instance 104. For example, the instance startup procedure described herein is performed under the control of the startup and control logic 902.


Turning to the method in FIG. 4, at 800, the startup and control program 902 launches the bootstrap logic 901 to initiate startup of the instance (e.g., in response to a startup command entered by a network administrator). As illustrated in FIG. 5, the bootstrap logic 901 is comprised of bootstrap binaries 513 and is configured based on bootstrap configuration parameters 512 stored within the configuration data hierarchy 420 of the central database 114. Thus, if necessary, the bootstrap logic 901 may be modified/updated at a single, central location and subsequently distributed to servers/instances upon request.


At 802, the bootstrap logic retrieves up-to-date configuration data 420 from the central database 114 including the layout of the instance 104 (e.g., identifying the servers and dispatchers to be started) and the parameters/arguments to be used for each server and dispatcher within the instance 104. In one embodiment, the bootstrap logic 901 uses this information to construct a description of the instance, which it provides to the startup and control logic 902 in the form of an “Instance Properties” data object. In one embodiment, the Instance Properties data object is simply a text file containing a list of servers/dispatchers and associated parameters which the startup and control logic parses to determine the instance layout and settings. However, various alternate data formats may be employed for the Instance Properties file while still complying with the underlying principles of the invention (e.g., such as the “Extensible Markup Language” or “XML” format).
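Since the Instance Properties data object is described only as a text file listing servers/dispatchers and their parameters, the sketch below assumes a simple hypothetical one-node-per-line format; the actual format (or an XML variant) may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of parsing a plain-text "Instance Properties" listing, assuming a
// hypothetical format of one node per line: "type name arguments...".
public class InstanceProperties {

    public record NodeEntry(String type, String name, List<String> arguments) {}

    public static List<NodeEntry> parse(List<String> lines) {
        List<NodeEntry> nodes = new ArrayList<>();
        for (String line : lines) {
            if (line.isBlank() || line.startsWith("#")) {
                continue; // skip comments and empty lines
            }
            String[] parts = line.trim().split("\\s+");
            nodes.add(new NodeEntry(parts[0], parts[1],
                    List.of(parts).subList(2, parts.length)));
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<NodeEntry> layout = parse(List.of(
                "dispatcher D00 -Xmx256m",
                "server     S01 -Xmx1024m",
                "server     S02 -Xmx1024m"));
        layout.forEach(System.out::println);
    }
}
```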


At 804, using the instance layout and configuration data from the Instance Properties data object, the startup and control logic 902 builds a list of servers and/or dispatchers to be started, including an indication of the specific instance processes to be started on each server/dispatcher. In one embodiment of the invention, prior to starting each server/dispatcher identified in the list, the startup and control logic 902 launches node-specific bootstrap logic 905, at 806, to synchronize the binaries and configuration settings on each server and/or dispatcher. For example, if the instance 104 was inoperative for a period of time, the global and/or server/dispatcher-specific binaries and configuration settings 900 may have changed within the central database 114. Accordingly, in one embodiment, when the node-specific bootstrap logic 905 is launched, it compares the binaries and configuration settings stored in the local file system of the server/dispatcher being started to the binaries and configuration settings 900 stored in the central database 114. In one embodiment, the comparison is performed between an index of the data stored locally on the server/dispatcher and an index of the data stored within the central database 420 (e.g., an index built from the hierarchical structure illustrated in FIG. 5).


Regardless of how the comparison is performed, if the binaries and/or configuration settings stored within the local file system 904 of the dispatcher/server are out-of-date, then the current binaries and/or configuration settings 900 are retrieved from the central database 114 and stored within the local file system 904 of the dispatcher/server. In one embodiment, only the binaries and/or configuration settings which are new are copied to the local file system 904, thereby conserving network bandwidth and server resources.
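One minimal way to express this comparison, assuming a hypothetical index that maps file paths to version tags (or hashes), is sketched below; only the entries that differ from the central database would then be copied to the local file system.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the synchronization check performed by node-specific bootstrap
// logic: compare an index of the locally stored binaries/configuration with
// the index held in the central database and return the stale entries.
public class SyncIndexComparator {

    // Returns the subset of central entries that are missing or out-of-date locally.
    public static Map<String, String> outOfDate(Map<String, String> localIndex,
                                                Map<String, String> centralIndex) {
        Map<String, String> stale = new HashMap<>();
        centralIndex.forEach((path, version) -> {
            if (!version.equals(localIndex.get(path))) {
                stale.put(path, version); // only these would be copied to the local file system
            }
        });
        return stale;
    }

    public static void main(String[] args) {
        Map<String, String> local = Map.of("bin/kernel.jar", "v41", "cfg/server.props", "v12");
        Map<String, String> central = Map.of("bin/kernel.jar", "v42", "cfg/server.props", "v12");
        System.out.println(outOfDate(local, central)); // prints {bin/kernel.jar=v42}
    }
}
```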


Once synchronization is complete, at 808, the startup and control logic executes the processes on each of the servers using arguments/parameters included within the Instance Properties data object. At 810, the startup and control logic initializes the service framework and services on the servers/dispatchers within the instance 104. For example, the service framework and services are the J2EE service framework and J2EE services, respectively, in a Java environment. However, various other types of services/frameworks may be employed while still complying with the underlying principles of the invention.
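A sketch of launching a node process with the arguments taken from the Instance Properties data object is given below; the command line and entry-point class are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative launcher: starts one server or dispatcher process of the
// instance using arguments supplied by the Instance Properties data object.
public class NodeLauncher {

    public static Process launch(List<String> nodeArguments) throws IOException {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.addAll(nodeArguments);        // e.g. heap settings from Instance Properties
        command.add("com.example.NodeMain");  // hypothetical entry point of the node process
        return new ProcessBuilder(command)
                .inheritIO()                  // share the parent's stdout/stderr
                .start();
    }
}
```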


Embodiments of the invention may include various operations as set forth above. The operations may be embodied in machine-executable instructions which cause a general-purpose or special-purpose processor to perform certain operations. Alternatively, these operations may be performed by specific hardware components that contain hardwired logic for performing the operations, or by any combination of programmed computer components and custom hardware components.


Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may include, but is not limited to, flash memory, optical disks, CD-ROMs, DVD-ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, propagation media, or other types of machine-readable media suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).


In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes can be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims
  • 1. A system comprising: a message server that is configured to be restarted after a failure without performing state recovery operations; anda plurality of instances of an application server coupled in a star topology with the message server at a center of the star topology, the message server handling communications between the plurality of instances of the application server, one or more of the plurality of instances to register or reregister instance-specific information with the message server upon a starting or restarting, respectively, of the message server, the instance-specific information identifying one or more services that the one or more of the plurality of instances is configured to provide to each of the plurality of instances.
  • 2. The system of claim 1 wherein each of the plurality of instances comprises: a dispatcher node; anda plurality of server nodes.
  • 3. The system of claim 2 wherein each of the plurality of server nodes comprises a java 2 enterprise edition (J2EE) engine.
  • 4. The system of claim 1 further comprising a central lock server to provide cluster wide locks to the plurality of instances.
  • 5. The system of claim 1 wherein the message server comprises: a first data structure to store a list of connected clients; anda second data structure to store a list of services provided in the system.
  • 6. The system of claim 1, wherein each of the plurality of instances is started using a first instance-specific bootstrap logic, the first instance-specific bootstrap logic synchronized with a second instance-specific bootstrap logic stored in the database.
  • 7. The system of claim 1, wherein a node within the plurality of instances is started using a first node-specific bootstrap logic, the first node-specific bootstrap logic synchronized with a second node-specific bootstrap logic stored in the database.
  • 8. The system of claim 1, wherein the plurality of instances are unable to communicate with each other during a failure of the message server.
  • 9. A non-transitory computer readable storage media containing executable computer program instructions which when executed cause a digital processing system to perform a method comprising: starting a central services node to provide a locking service and a messaging service, the messaging service configured to be restarted after a failure without performing state recovery operations, the messaging service handling communications between a plurality of application server instances;starting the plurality of application server instances;organizing the application server instances into a cluster having star topology with the central services node at a center of the star topology; andregistering or reregistering instance-specific information with the central services node upon starting or restarting, respectively, of the central services node, the registering or reregistering initiated by one or more of the plurality of application server instances, the instance-specific information identifying one or more services that the one or more of the plurality of application server instances are configured to provide to each of the plurality of application server instances.
  • 10. The non-transitory computer readable storage media of claim 9, the method further comprising sharing a database among the plurality of application server instances.
  • 11. The non-transitory computer readable storage media of claim 9, the method wherein starting a plurality of application server instances comprises starting, for each application server instance of the plurality, a dispatcher node and a plurality of server nodes.
  • 12. The non-transitory computer readable storage media of claim 9, the method further comprising starting a message server having no persistent state.
  • 13. The non-transitory computer readable storage media of claim 12, the method further comprising restarting the message server without state recovery responsive to a system failure.
  • 14. The non-transitory computer readable storage media of claim 9, the method further comprising conducting inter instance communication through the messaging service.
  • 15. A system comprising: means for organizing a plurality of application servers instances into a cluster having a star topology with a central services node at a center of the star topology;means for sharing a storage resource across the cluster; andmeans for performing centralized inter instances communication that is configured to be restarted after a failure without performing state recovery operations, the inter instances communication including registering or reregistering of instance-specific information with the central services node upon a starting or restarting, respectively, of the central services node, the registering or reregistering initiated by one or more of the plurality of application server instances, the instance-specific information identifying one or more services, that the one or more of the plurality of application server instances are configured to provide to each of the plurality of application server instances.
  • 16. The system of claim 15 further comprising means for centrally locking a resource within the cluster.
  • 17. The system of claim 15 wherein the means for performing comprises a message server having no persistent state.
  • 18. The system of claim 15 wherein the means for performing comprises means for recording services provided in the cluster.
  • 19. A method comprising: starting a central services node to provide a locking service and a messaging service, the messaging service being configured to be restarted after a failure without performing state recovery operations, the messaging service handling communications between a plurality of application server instances;starting the plurality of application server instances;organizing the plurality of application server instances into a cluster having a star topology with the central services node at a center of the star topology; andregistering or reregistering instance-specific information with the central services node upon a starting or a restarting, respectively, of the central services node, the instance-specific information identifying one or more services the one or more of the plurality of application server instances are configured to provide to each of the plurality of application server instances.
  • 20. The method of claim 19 further comprising sharing a database among the plurality of application server instances.
  • 21. The method of claim 19 wherein starting a plurality of application server instances comprises starting, for each instance of the plurality, a dispatcher node and a plurality of server nodes.
  • 22. The method of claim 19 wherein the messaging service has no persistent state.
  • 23. The method of claim 22 further comprising restarting the message server without state recovery responsive to a system failure.
  • 24. The method of claim 19 further comprising conducting inter instance communication through the messaging service.
  • 25. The method of claim 19, further comprising notifying each of the plurality of application server instances of the registering or reregistering of the instance-specific information.
  • 26. The method of claim 19, wherein the instance-specific information further includes information about a new service that the one or more of the plurality of instances provide.
Related Publications (1)
Number Date Country
20050188021 A1 Aug 2005 US