The present invention relates generally to the field of virtual servers.
Virtual servers may be used to make more efficient use of physical servers. Physical servers may be added into a virtual frame data repository in order to be managed by a virtual frame management system. The physical servers have many attributes, and in a large environment, it becomes tedious and error-prone to enter these details manually. Additionally, once assets are added, it is important to be able to locate the equipment as well as correlate the added equipment with interconnected equipment to enable the assessment of changes to equipment. For example, if a system administrator knows that server A is connected to switch B then the system administrator can determine that bringing switch B down for maintenance will have an effect on service provided by server A.
Some existing systems, such as Cisco Discovery Protocol (CDP), either report which switch port a physical server interface is connected to or allow that information to be derived. Some existing switches report which MAC addresses are visible from a switch port. It is also common for some switches to send SNMP traps on linkUp and linkDown events.
However, not all switches provide this information. Furthermore, conventional systems fail to correlate virtual servers with physical servers and their corresponding physical connections.
The invention is illustrated by way of example, and not limitation, in the figures of the accompanying drawings, in which like reference numerals indicate the same or similar features unless otherwise indicated.
In the drawings,
A method and a system for physical server discovery and correlation in a virtual server environment are described. It will be evident, however, to one skilled in the art that the present application may be practiced without these specific details.
Referring to
The system 10 is shown, by way of example, to include a switch group 12 including one or more switches 14, 16. The switch group 12 is connected, for example, via an InfiniBand link 18 to one or more server pools 20. By way of example, three physical server pools 20.1-20.3 (on which the virtual servers are deployed) are shown in
The switch 14 is shown to communicate with a plurality of different networks (Local Area Networks, Wide Area Networks, or the like) via communication links 38, 40, 42. For example, the communication links 38, 40, 42 may be Ethernet connections and, accordingly, each switch 14, 16 may include one or more Ethernet gateway modules 44. In the example system 10, the communication link 38 is shown to connect to a network 46 interconnecting a plurality of hosts 48.1-48.5. The hosts 48.1-48.5 may form part of another data network, or be any other network host.
The switch 14 is also shown to communicate via the communication link 40 to a network 50 which may, for example, be an enterprise network. The network 50 is shown to communicate with desktop computers 52.1-52.2 and a subnet 54 which, in turn, is connected to desktop computers 56.1-56.3. Further, the switch 14 is also shown to connect via the communication link 42 to a network such as the Internet 58. It will however be appreciated that the aforementioned networks are merely example networks and different configurations and different numbers of networks and subnets may be provided that connect a wide range of network devices.
The system 10 may allow virtualization of servers deployed on the physical servers 22.1-22.n that may be managed by a management module 60, which is shown, by way of example, to reside at the switch 14. It will, however, be appreciated that the management module 60 may reside in other components. The management module 60 communicates with a virtual frame director 62 that controls the provisioning of the server pools 20.1-20.3. In an example embodiment, the virtual frame director 62 communicates via a network 64 with the management module 60. The system 10 also includes a third party management tool 65 that communicates with the virtual frame director 62 and/or with the management module 60 to manage the provisioning of virtual servers. In an example embodiment, the network 64 is an Ethernet network and, accordingly, the switch 14 may thus include one or more Ethernet ports 66. It will however be appreciated that the various communication links linking the various components/devices in the system 10 are not restricted to InfiniBand connections, Ethernet connections, or the like. Any communication means may be provided to interconnect the various components.
Referring to
As shown by way of example in
Referring to
The system 100 is shown to include a number of physical servers 140.1-140.n in a physical server bank 140 (which may correspond to any one of the server pools 20.1-20.3 in
An InfiniBand subnet manager 130 is connected to the network fabric 120 to manage nodes in the network fabric 120 including configuring switch ports and identifying their associated connections (e.g., physical addresses of HCAs). Thus, in an example embodiment, the subnet manager 130 runs a software management module for communicating with the one or more switches 120.1-120.n and the one or more servers 140.1-140.n. Management module(s) (e.g., see
In an example embodiment, as described in more detail below, the virtual frame director 62 directs provisioning of the plurality of virtual servers 110.1-110.n on the physical servers 140.1-140.n and is configured to monitor an event related to a link between the physical servers 140.1-140.n and the switches 120.1-120.n and, in response to the event, update a virtual server database (e.g., virtual frame director repository or database 150). Monitoring or enforcing policies of one or more of the virtual servers 110.1-110.n may, for example, be in response to a linkUp event and/or linkDown event.
When an event related to one or more links occurs (e.g., the physical server 140.1 in the physical server bank 140 powers on), links between the InfiniBand host channel adapter (HCA) of that physical server 140.1 and the switch or switches to which it is connected (e.g., the switch 120.1 in the network fabric 120) are established (see block 162 in
In the given example, the switch management module for the switch 120.1 that is part of the example link may detect that the link state for one of its switch ports has transitioned to the up state, and may send a linkUp trap to the registered receivers, which include the virtual frame director 62 (see block 166 in
When the waiting period ends, the virtual frame director 62 may query the subnet manager 130 to refresh the virtual frame director's view of the connection topology (e.g., the InfiniBand topology) as shown at block 170. The virtual frame director 62 may also query each switch 120.1-120.n to learn the mappings of switch ports to internal switch chips of the switches 120.1-120.n. When the refresh is finished, the virtual frame director 62 may perform a linkUp event handling process for each trap that was received (see block 174). Thus, upon occurrence of an event, the virtual frame director 62 may automatically, without human intervention, detect changes in network topology. This is in contrast to systems that discover the status of the network topology only periodically, where an event may take place (e.g., a physical server may go up and down) during the period between discovery instances. The method 160 may thus respond dynamically to network events and provide a real-time “snapshot” of virtual server-related configuration data.
The handling process in block 174 may depend upon whether the event is a linkUp event or a linkDown event. The handling process may include provisioning a new virtual server, executing (or at least initiating) a monitoring policy, or performing any other system-related operations, diagnostics, or the like.
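The trap handling sequence of method 160 (collect traps, wait, refresh the topology once, then handle each trap) can be sketched as follows. This is a minimal illustration under stated assumptions, not the described implementation: the `LinkTrap` and `TrapBatcher` names, their fields, the 5-second default, and the choice to start the waiting period at the first trap of a batch are all hypothetical. Time is passed in explicitly so the logic can be exercised without real timers.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class LinkTrap:
    """Hypothetical trap record; field names are illustrative."""
    kind: str        # "linkUp" or "linkDown"
    switch_ip: str   # address of the switch that sent the trap
    if_index: int    # ifIndex of the switch port that changed state


@dataclass
class TrapBatcher:
    """Buffers traps during a waiting period, then refreshes topology once."""
    refresh_topology: Callable[[], None]
    handle_trap: Callable[[LinkTrap], None]
    wait_seconds: float = 5.0            # assumed value, not from the text
    _pending: List[LinkTrap] = field(default_factory=list)
    _deadline: float = 0.0

    def on_trap(self, trap: LinkTrap, now: float) -> None:
        # Assumption: the first trap of a batch starts the waiting period.
        if not self._pending:
            self._deadline = now + self.wait_seconds
        self._pending.append(trap)

    def poll(self, now: float) -> int:
        """Process the batch if the waiting period has ended; return count handled."""
        if not self._pending or now < self._deadline:
            return 0
        batch, self._pending = self._pending, []
        self.refresh_topology()          # one refresh for the whole batch
        for trap in batch:
            self.handle_trap(trap)       # per-trap link event handling
        return len(batch)
```

Batching in this way means that a server with several links, which may raise several traps within a short interval, causes a single topology refresh rather than one per trap.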
Referring to
The parent node data structure may also contain a global unique identifier (GUID) of the node, which is the GUID of the HCA (e.g., see HCA 94 in
Thereafter, as shown at block 190, information about the physical server (e.g., physical server 140.1) associated with this HCA is retrieved from the virtual frame director repository 150. For example, the existing HCA bindings for this HCA may be retrieved from the virtual frame director repository 150. The HCA bindings may be contained in an HCA data structure and include the HCA port number (derived from the server port identified in box 186), switch id (derived from data repository by knowing the IP address of the switch), switch slot, and switch port (the switch slot and port may be derived from the ifIndex). The current set of HCA bindings may be checked to determine if there is an existing binding for the same HCA port. A new HCA binding data structure may be added if one does not exist or the information is used to update the existing one. The virtual frame director repository 150 may then be updated with the new information.
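The add-or-update step for HCA bindings can be illustrated with a small sketch. It assumes, hypothetically, that bindings are held in a mapping keyed by (HCA id, HCA port number) with the switch location as the value; the type aliases and the `upsert_binding` function are illustrative names, and the derivation of slot and port from the ifIndex is deliberately left out because its encoding is switch-specific.

```python
from typing import Dict, Tuple

PortKey = Tuple[int, int]              # (HCA id, HCA port number)
SwitchLocation = Tuple[int, int, int]  # (switch id, switch slot, switch port)


def upsert_binding(
    store: Dict[PortKey, SwitchLocation],
    key: PortKey,
    location: SwitchLocation,
) -> str:
    """Add a binding, or update the existing one for the same HCA port."""
    existing = store.get(key)
    if existing is None:
        store[key] = location        # no binding yet for this HCA port
        return "added"
    if existing == location:
        return "unchanged"           # repository already matches the topology
    store[key] = location            # the port moved to a different switch port
    return "updated"
```

Keying on the HCA port keeps at most one binding per port, so moving a cable to a different switch port replaces the old binding instead of accumulating stale ones.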
The virtual server (e.g., a virtual server 110.1-110.n) associated with the physical server (e.g., a physical server 140.1-140.n) is then identified (see block 192) using the information in the virtual frame director repository 150. If a physical server 140.1-140.n is not assigned to a virtual server, then an associated virtual server may not be found. A message may then be logged recording the linkUp event (and optionally associated details) and describing details of the physical server, the virtual server (if assigned), the switch, a switch slot, and a switch port. At this point, additional handling processes can be performed based on the identified information (see block 194). For example, by identifying the virtual server (e.g., 110.1) assigned to a physical server (e.g., 140.1) that is associated with a linkUp event, a system administrator, or the system itself (automatically), could locate a policy associated with the virtual server that is affected by the linkUp event.
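As a rough illustration of the identification and logging step, the sketch below builds the log message for a link event. The `describe_link_event` function, its parameters, and the `assignments` mapping (physical server id to virtual server name) are hypothetical stand-ins for data held in the virtual frame director repository 150.

```python
from typing import Dict, Optional


def describe_link_event(
    kind: str,
    server_id: int,
    switch_id: int,
    slot: int,
    port: int,
    assignments: Dict[int, str],
) -> str:
    """Build a log line for a link event, naming the virtual server if one is assigned."""
    virtual: Optional[str] = assignments.get(server_id)
    # A physical server need not be assigned to any virtual server.
    vs_text = virtual if virtual is not None else "unassigned"
    return (
        f"{kind}: physical server {server_id} (virtual server: {vs_text}) "
        f"on switch {switch_id}, slot {slot}, port {port}"
    )
```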
It will thus be noted that methods 160 and 180 dynamically monitor and/or detect changes in network topology (e.g., changes in associations between virtual servers and physical servers, changes in physical connections between physical servers and switches/switch ports) so that at any given time comprehensive information on the virtual server system 100 is available (e.g., in the virtual frame director repository 150).
As mentioned above, the methods 160 and 180 are not restricted to linkUp events but also relate to linkDown events. On linkDown events, the process of updating the network topology information in the virtual frame director data repository may be substantially similar (or even the same). However, in an example embodiment, the link event handling functionality may differ when the event is determined to be a linkDown event. The same correlation methods (e.g., the method 180) may be used to find out which virtual server was affected by the linkDown event. If a virtual server (e.g., virtual server 110.1) is determined to be assigned to the physical server (e.g., physical server 140.1) that had a linkDown event, the virtual frame director 62 may send a signal to a policy system/module to check that server. This signal may cause the policy system to prioritize checking that server (e.g., physical server 140.1) instead of waiting for the normal schedule to perform checks. For example, the policy/monitoring system could try to ping the host to determine if it is still reachable on an alternate physical link. In this manner, it may be possible to detect certain failures in a virtual server system earlier, in a dynamic manner, rather than in response to periodic checking routines. Thus, events relating to a connection between a physical server and a switch may automatically trigger virtual server-related actions and procedures.
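The reprioritization on a linkDown event can be sketched with a priority queue. This is one possible arrangement under stated assumptions, not the described policy system: the `PolicyCheckQueue` class, its two priority levels, and the server names are hypothetical.

```python
import heapq
from typing import List, Tuple

URGENT, ROUTINE = 0, 1  # lower value is checked first


class PolicyCheckQueue:
    """Hypothetical queue of servers awaiting a policy/monitoring check."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = 0  # tie-breaker that preserves insertion order

    def schedule(self, server: str, priority: int = ROUTINE) -> None:
        heapq.heappush(self._heap, (priority, self._seq, server))
        self._seq += 1

    def on_link_down(self, server: str) -> None:
        # Jump ahead of routine checks instead of waiting for the normal schedule.
        # The routine entry is left in place, so the scheduled check still happens.
        self.schedule(server, URGENT)

    def next_server(self) -> str:
        return heapq.heappop(self._heap)[2]
```

The check that runs when a server is popped (for example, a ping over an alternate link) is outside the scope of this sketch.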
An example of a data structure for the HCA (e.g., in the virtual frame director repository 150) is:
An example of a node data structure for the HCA binding is:
Thus, the physical servers 140.1-140.n may each have a database identifier (id). Using this identifier, the method 180 can find the HCAs in the physical server 140.1-140.n. Using the HCA id, the method 180 can find the bindings for the HCA.
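The lookup chain from physical server id to HCAs to bindings can be illustrated as follows. The record layouts are assumptions made for the example (the field names are hypothetical and are not taken from the repository schema); only the chain of lookups reflects the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hca:
    hca_id: int       # database id of the HCA
    server_id: int    # database id of the owning physical server
    guid: str         # GUID reported for the HCA node


@dataclass(frozen=True)
class Binding:
    hca_id: int       # HCA this binding belongs to
    hca_port: int
    switch_id: int
    switch_slot: int
    switch_port: int


def bindings_for_server(
    server_id: int, hcas: List[Hca], bindings: List[Binding]
) -> List[Binding]:
    """Follow server id -> HCAs of that server -> bindings of those HCAs."""
    hca_ids = {h.hca_id for h in hcas if h.server_id == server_id}
    return [b for b in bindings if b.hca_id in hca_ids]
```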
The methods, systems and devices described herein may combine data from traps, a subnet manager, a switch configuration, and a virtual frame director configuration to provide information about virtual servers (e.g., the virtual servers 110.1-110.n) in addition to physical servers (e.g., the physical servers 140.1-140.n). The example methods and system described herein may also tie in to a data repository and policy engine for automated actions (e.g., adding servers, updating bindings, reprioritizing policies).
With the example methods, system and devices described herein, new physical servers can be added to the virtual server environment by plugging them into the network and turning the power on. The methods, system and devices may make it faster to detect failures on virtual servers. For example, removing a blade from a blade server would trigger a linkDown event. This would reprioritize the policy for that server and, using the methods described herein, the failure would be detected faster than waiting to poll that server. The methods, system and devices also facilitate adding additional handlers to do virtual server specific processing for events.
The methods, system and devices may provide the ability for administrators to track down equipment in a virtual server system. In an example embodiment, the bindings may be displayed with the physical server in a user interface (UI). Using data in the virtual frame director repository 150, a system administrator can then locate the appropriate switch, slot, port and follow the cable to the server if needed.
The methods, system and devices may also provide the ability to do impact analysis, such as what servers will be affected if a particular switch is taken down. More advanced functionality may also be performed, such as having a virtual frame director (e.g., a virtual frame director 62) migrate active physical servers to other switches, or facilitate a rolling upgrade (e.g., only update one switch at a time in a set where a server is dual connected). If a switch does fail and the failure is detected, a report may be generated to show the compromised servers and/or migrate them to available servers.
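The impact analysis can be sketched as a query over the stored server-to-switch connections. In this hedged example, the `impact_of_switch_down` function and the "isolated"/"degraded" labels are hypothetical; the input is assumed to be a list of (server, switch) pairs, one per bound link.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def impact_of_switch_down(
    connections: Iterable[Tuple[str, str]], switch: str
) -> Dict[str, Set[str]]:
    """Classify servers attached to `switch` by whether another switch still serves them."""
    switches_by_server: Dict[str, Set[str]] = defaultdict(set)
    for server, sw in connections:
        switches_by_server[server].add(sw)

    result: Dict[str, Set[str]] = {"isolated": set(), "degraded": set()}
    for server, switches in switches_by_server.items():
        if switch not in switches:
            continue  # this server is not attached to the switch in question
        # A server with a link to another switch keeps connectivity.
        bucket = "degraded" if len(switches) > 1 else "isolated"
        result[bucket].add(server)
    return result
```

For a rolling upgrade, a switch whose "isolated" set is empty could be taken down without cutting off any server, since every attached server is dual connected.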
The exemplary computer system 200 includes a processor 202 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 204 and a static memory 206, which communicate with each other via a bus 208. The computer system 200 may further include a video display unit 210 (e.g., a liquid crystal display (LCD)). The computer system 200 also includes an alphanumeric input device 212 (e.g., a keyboard), a cursor control device 214 (e.g., a mouse), a disk drive unit 216, a signal generation device 218 (e.g., a speaker) and a network interface device 220.
The disk drive unit 216 includes a machine-readable medium 222 on which is stored one or more sets of instructions (e.g., software 224) embodying any one or more of the methodologies or functions described herein. The software 224 may also reside, completely or at least partially, within the main memory 204 and/or within the processor 202 during execution thereof by the computer system 200, the main memory 204 and the processor 202 also constituting machine-readable media.
The software 224 may further be transmitted or received over a network 226 via the network interface device 220.
While the machine-readable medium 222 is shown in an exemplary embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that causes the machine to perform any one or more of the methodologies of the present invention. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.
Thus, a method and a system for physical server discovery and correlation in a virtual server environment are described. Although the present invention has been described with reference to specific exemplary embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
This patent application claims the benefit of priority, under 35 U.S.C. Section 119(e), to U.S. Provisional Patent Application Ser. No. 60/746,254, filed on May 2, 2006, the entire content of which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20070260721 A1 | Nov 2007 | US |
Number | Date | Country | |
---|---|---|---|
60746254 | May 2006 | US |