1. Field of the Invention
This invention relates to computer systems and, more particularly, to high availability and scalability of applications operating within clustered computer systems.
2. Description of the Related Art
Enterprises have become increasingly dependent on information technology applications for the success of their businesses. It has become critical for these applications to be available to employees, partners, and/or customers around the clock. In addition, it is desirable for these applications to scale to large numbers of users. Consequently, various strategies have been employed to increase the availability and scalability of applications. One strategy has been to deploy applications on multiple host computers. For example, each computer that hosts an application may be configured with one or more redundant failover computers to take its place in the event of a failure. Another strategy is to deploy applications that are distributed on a group of hosts, commonly referred to as a computer cluster. Computer clusters use multiple computers interconnected by a network to provide the services of a single larger computer. The individual hosts in a computer cluster may share the application's load and serve as failover hosts in the event any of the hosts fails or becomes overloaded.
In order to increase the effectiveness of the above strategies, it is desirable for failover hosts and cluster members to be configured such that there are as few single points of failure as possible among the members. For example, if two hosts share a power supply, a network connection, or some other critical resource, they are not good candidates to be primary and secondary hosts in a failover pairing. More generally, it may be desirable to configure applications among hosts that are separated geographically as much as possible. Geographic separation may include placing hosts in different enclosures, in different rooms within a building, in different buildings, in different cities, etc., to avoid single points of failure.
Unfortunately, in many distributed applications, hosts may be identified by a network address, such as an IP address, that conveys little, if any, geographic information. In addition, applications may be deployed in a virtualized environment in which hosts are arranged in computer clusters. In such environments, the virtualization system may re-assign virtual hosts as it performs load balancing and other tasks, dynamically changing the physical locations of the hosts. These factors may complicate determination of the physical location of a host.
In view of the above, an effective system and method for assigning hosts to applications that accounts for these issues and provides high availability and scalability of the applications is desired.
Various embodiments of a computer system and methods are disclosed. In one embodiment, a computer system includes a plurality of hosts, a cluster manager, and a cluster database. The cluster database includes entries corresponding to the hosts, each entry including data identifying a physical location of a corresponding host. The cluster manager uses the data identifying a physical location of a corresponding host to select at least two hosts and assign the selected hosts to a service group for executing an application.
In a further embodiment, the cluster manager selects hosts via a location-based algorithm that determines which hosts are least likely to share a single point of failure. In a still further embodiment, the data identifying a physical location of a corresponding host includes a hierarchical group of location attributes describing two or more of a host's country, state, city, building, room, enclosure, and RFID. The location-based algorithm identifies a group of selected hosts whose smallest shared location attribute is highest in the hierarchical group.
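To make the selection concrete, the following sketch illustrates one possible form of such a location-based algorithm, assuming each host's physical location is recorded as a tuple ordered from coarsest to finest attribute (country, state, city, building, room, enclosure). The host names, location values, and function names are hypothetical and are used only for illustration; they do not represent a particular implementation.

```python
# Illustrative sketch of a location-based selection algorithm, assuming each
# host's location is a tuple ordered from coarsest to finest attribute.
from itertools import combinations

LEVELS = ("country", "state", "city", "building", "room", "enclosure")

def shared_depth(loc_a, loc_b):
    """Return how many leading (coarse-to-fine) attributes two hosts share.
    0 means they share nothing; len(LEVELS) means they share everything."""
    depth = 0
    for a, b in zip(loc_a, loc_b):
        if a != b:
            break
        depth += 1
    return depth

def select_pair(hosts):
    """Pick the pair of hosts whose deepest shared attribute is as coarse as
    possible, i.e. the pair least likely to share a single point of failure."""
    best_pair, best_depth = None, len(LEVELS) + 1
    for (name_a, loc_a), (name_b, loc_b) in combinations(hosts.items(), 2):
        depth = shared_depth(loc_a, loc_b)
        if depth < best_depth:
            best_pair, best_depth = (name_a, name_b), depth
    return best_pair

hosts = {
    "host131": ("US", "CA", "Palo Alto", "B1", "R2", "E7"),
    "host132": ("US", "CA", "Palo Alto", "B1", "R2", "E7"),
    "host151": ("US", "TX", "Austin",    "B4", "R1", "E2"),
}
print(select_pair(hosts))  # ('host131', 'host151') -- hosts in different states
```

In this sketch, a smaller shared depth corresponds to a shared attribute that sits higher (coarser) in the hierarchy, so minimizing the depth selects the pair least likely to share a single point of failure.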
In a still further embodiment, the system updates the data identifying a physical location of a corresponding host in the cluster database in response to detecting that a physical location of a host has changed. In a still further embodiment, at least some of the hosts are virtual hosts in a virtualized environment and a physical location of each virtual host may change dynamically during host operation. In a still further embodiment, the at least two hosts include a primary host and a secondary host. The primary host is configured to execute at least a portion of an application and the secondary host is configured to execute the at least a portion of the application in response to an indication that the primary host has failed. In a still further embodiment, the service group includes two or more load-balancing hosts that share tasks associated with an application.
These and other embodiments will become apparent upon consideration of the following description and accompanying drawings.
While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
In alternative embodiments, system 100 may include a different number of regions, enclosures, and/or hosts as needed to support a variety of high availability and highly scalable applications. Hosts may be grouped in a variety of ways to form computing clusters depending on the needs of the applications that are supported. The hosts that are included in a cluster may be physically located in the same enclosure or in different enclosures in the same region, or in different regions.
During operation, virtualization may be implemented on any of hosts 131-139, 151-159, and 171-179. Accordingly, each of hosts 131-139, 151-159, and 171-179 may include one or more virtual hosts. Distributed applications may be executed on computer clusters consisting of the virtual hosts that are included in physical hosts 131-139, 151-159, and 171-179.
Virtualization systems 220, 240, and 260 may be any of a variety of systems that manage the resources provided by host hardware and provide virtual machines on which one or more applications may be executed. Applications that may be executed on the provided virtual machines include database applications, email systems, collaboration systems, and the like. Cluster servers 224, 244, and 264 may be instances of any of a variety of software products for managing virtualized computer clusters, such as VCSOne from Symantec Corporation. During operation, virtualization systems 220, 240, and 260 provide resources that cluster servers 224, 244, and 264 provision as clusters of nodes, where each node provides computing functionality for one or more applications. Nodes may be organized as service groups providing redundancy to increase availability and scalability of the applications. Service groups may include as few as two nodes operating as a primary computing element and a failover computing element. Service groups may also include much larger arrays of redundant nodes on which an application may be distributed. Cluster servers 224, 244, and 264 may maintain records of the nodes that are in use and/or available within cluster database 270.
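As an illustration only, a service group of the kind described above might be modeled as follows; the class, field, and method names are assumptions for this sketch and are not part of any cluster server product's interface.

```python
# Illustrative model of a service group holding a primary node and an ordered
# list of failover nodes; names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ServiceGroup:
    name: str
    primary: str                                        # node currently executing the application
    failovers: List[str] = field(default_factory=list)  # redundant nodes, in failover order

    def fail_over(self):
        """Promote the next failover node when the primary is lost."""
        if self.failovers:
            self.primary = self.failovers.pop(0)
        return self.primary

sg = ServiceGroup(name="SG-1", primary="host131", failovers=["host151"])
sg.fail_over()  # primary becomes "host151"
```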
Each node entry includes a set of attributes, as described below.
In one embodiment, group ID field 371 may include an identifier for the group of which the corresponding node is a member. Host name field 372 may include a name for the corresponding host that is recognizable to a user or system administrator. IP address field 373 may include an IP address to be used to communicate with the corresponding node. Status field 374 may include data indicating whether or not a corresponding node is operating, has connectivity, is backed up by a failover target, is a failover target for another node, etc. Items that are included in status field 374 may be determined according to the needs of individual clusters and/or by cluster manager 310. Failover target field may include data identifying a failover target for the corresponding node, such as an IP address, hostname, or other identifiers. Location field 376 may include data that specifies a physical location of the corresponding node. An example of one embodiment of physical location data is presented below.
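For illustration, a node entry with the fields described above might be represented as in the following sketch; the concrete types and field names are assumptions, since the disclosure does not prescribe a particular representation.

```python
# Illustrative node entry; simple Python types stand in for whatever encoding
# the cluster database actually uses.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NodeEntry:
    group_id: str                                 # service group membership (field 371)
    host_name: str                                # administrator-recognizable name (field 372)
    ip_address: str                               # address used to reach the node (field 373)
    status: dict = field(default_factory=dict)    # operating/connectivity/failover flags (field 374)
    failover_target: Optional[str] = None         # IP address, hostname, or other identifier
    location: tuple = ()                          # hierarchical physical location, coarsest first (field 376)

entry = NodeEntry(
    group_id="SG-1",
    host_name="host131",
    ip_address="10.0.1.31",
    status={"operating": True, "failover_target_assigned": True},
    failover_target="host151",
    location=("US", "CA", "Palo Alto", "B1", "R2", "E7"),
)
```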
During operation, whenever a new service group is created, cluster manager 310 may create a corresponding group list in cluster database 270 and populate a node entry for each node that is a member of the service group. Cluster manager 310 may also update each group list whenever there are membership changes in the corresponding service group. For example, if a service group includes virtual machines in a virtualized environment, group lists may be updated by cluster manager 310 in response to changes in the location of any virtual machines that are members of the group. Also, cluster manager 310 may update group lists if a node fails. In one embodiment, the cluster manager may send a heartbeat signal to each node in each cluster to monitor cluster node status. If a particular node does not respond to the heartbeat signal, cluster manager 310 may update the status of the particular node in each group list of which the node is a member. In addition, cluster manager 310 may update group lists in response to a user input, a command, or on a periodic basis according to a schedule, etc. User inputs or commands may optionally include a request to reconfigure one or more groups. These and other update circumstances will be apparent to one of ordinary skill in the art.
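A minimal sketch of the heartbeat-driven status update described above is shown below, assuming a plain TCP connection attempt as the heartbeat and an in-memory dictionary of group lists; both are illustrative stand-ins for whatever transport and cluster database a real cluster manager would use.

```python
# Illustrative heartbeat check: probe each node and mark unresponsive nodes in
# every group list that contains them.
import socket

def is_alive(ip_address, port=22, timeout=2.0):
    """Treat a successful TCP connect as a heartbeat response (port is an assumed example)."""
    try:
        with socket.create_connection((ip_address, port), timeout=timeout):
            return True
    except OSError:
        return False

def refresh_status(group_lists, node_addresses):
    """Mark unresponsive nodes as not operating in every group list that includes them."""
    for host, ip in node_addresses.items():
        if not is_alive(ip):
            for group in group_lists.values():
                if host in group:
                    group[host]["operating"] = False

group_lists = {"SG-1": {"host131": {"operating": True},
                        "host151": {"operating": True}}}
node_addresses = {"host131": "10.0.1.31", "host151": "10.0.2.51"}
refresh_status(group_lists, node_addresses)
```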
At various times, such as when a new service group is created or updated, cluster manager 310 may select particular nodes for membership in particular service groups so as to maximize service availability and/or scalability or to minimize single points of failure in the service groups. For example, cluster manager 310 may select two nodes for membership in a redundant pair from a group of nodes such that the physical locations of the selected nodes have the greatest separation of all available pairs of nodes. In more complex service groups, cluster manager 310 may use other algorithms to minimize common physical locations among the nodes. Cluster manager 310 may apply a set of rules for selecting nodes. For example, a rule may specify that two nodes that are located in the same enclosure may not be assigned to the same service group. Various other selection rules are possible and are contemplated.
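Building on the earlier location sketch, the following example shows how such a rule might be enforced by filtering out candidate pairs whose hosts share an enclosure; the rule encoding and location tuples are the same illustrative assumptions used above.

```python
# Illustrative selection rule: reject any candidate pairing whose hosts share
# the same enclosure (and therefore the same building and room).
from itertools import combinations

NUM_LEVELS = 6  # (country, state, city, building, room, enclosure)

def same_enclosure(loc_a, loc_b):
    """True when two hosts share every location attribute down to the enclosure."""
    return loc_a[:NUM_LEVELS] == loc_b[:NUM_LEVELS]

def allowed_pairs(hosts):
    """Yield only those host pairs that do not violate the same-enclosure rule."""
    for (a, loc_a), (b, loc_b) in combinations(hosts.items(), 2):
        if not same_enclosure(loc_a, loc_b):
            yield a, b

hosts = {
    "host131": ("US", "CA", "Palo Alto", "B1", "R2", "E7"),
    "host132": ("US", "CA", "Palo Alto", "B1", "R2", "E7"),
    "host151": ("US", "TX", "Austin",    "B4", "R1", "E2"),
}
print(list(allowed_pairs(hosts)))  # the host131/host132 pairing is filtered out
```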
It is noted that the foregoing flow charts are for purposes of discussion only. In alternative embodiments, the elements depicted in the flow charts may occur in a different order, or in some cases concurrently. Additionally, some of the flow chart elements may not be present in various embodiments, or may be combined with other elements. All such alternatives are contemplated.
It is noted that the above-described embodiments may comprise software. In such an embodiment, the program instructions that implement the methods and/or mechanisms may be conveyed or stored on a computer readable medium. Numerous types of media which are configured to store program instructions are available and include hard disks, floppy disks, CD-ROM, DVD, flash memory, Programmable ROMs (PROM), random access memory (RAM), and various other forms of volatile or non-volatile storage.
Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.