A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
Network servers and the accompanying local area networks (LANs) have expanded the power and increased the productivity of the work force. Just a few years ago, every workstation was a standalone personal computer incapable of communicating with any other computer in the office. Data had to be carried from person to person by diskette. Applications had to be purchased for each standalone personal computer at great expense. Capital-intensive hardware such as printers was duplicated for each standalone personal computer. Security and data backup were immensely difficult without centralization.
Network servers and their LANs addressed many of these issues. Network servers allow for resource sharing such as sharing equipment, applications, data, and the means for handling data. Centralized backup and security were seen as definite advantages. Furthermore, networks offered new services such as electronic mail. However, it soon became clear that the network servers could have their disadvantages as well.
Centralization, hailed as a solution, developed its own problems. A predicament that might shut down a single standalone personal computer would, in a centralized network, shut down all the networked workstations. Small difficulties are easily magnified by centralization, as is the case with the failure of a network interface card (NIC), a common dilemma. A NIC may be a card configured for Ethernet, Token-Ring, or another LAN standard, to name but a few. These cards fail occasionally, requiring examination, repair, or even replacement. Unfortunately, the entire network has to be powered down in order to remove, replace, or examine a NIC. Since it is not uncommon for modern network servers to have sixteen or more NICs, the frequency of the problem compounds along with the consequences. When the network server is down, none of the workstations in the office network system will be able to access the centralized data and centralized applications. Moreover, even if only the data or only the applications are centralized, a workstation will suffer decreased performance.
Frequent down times can be extremely expensive in many ways. When the network server is down, worker productivity comes to a standstill. There is no sharing of data, applications, or equipment, whether spreadsheets, word processors, or printers. Bills cannot go out and orders cannot be entered. Sales and customer service representatives are unable to obtain product information or pull up invoices. Customers browsing, or hoping to browse, through a commercial web page supported by the network server are abruptly cut off or are unable to access the web pages. Such frustrations may manifest themselves in the permanent loss of customers, or at the least, in the lowering of consumer opinion with regard to a vendor, a vendor's product, or a vendor's service. Certainly, down time for a vendor's network server will reflect badly upon the vendor's reliability. Furthermore, the vendor will have to pay for more service calls; rebooting a network server, after all, does require a certain amount of expertise. Overall, whenever the network server has to shut down, it costs the owner both time and money, and each server shutdown may have ramifications far into the future. The magnitude of this problem is evidenced by the great cost that owners of network servers are willing to absorb in order to avoid down time through the purchase of uninterruptible power supplies, surge protectors, and redundant hard drives.
What is needed to address these problems is an apparatus that can localize and isolate the problem module from the rest of the network server and allow for the removal and replacement of the problem module without powering down the network server.
The present invention includes methods of removing and replacing data processing circuitry. One embodiment comprises a method of changing an interface card in a computer, comprising removing a network interface module from the computer without powering down the computer and removing an interface card from the network interface module. The further acts of replacing the interface card in the network interface module and replacing the network interface module in the computer, again without powering down the computer, are also performed in accordance with this method.
Methods of making hot swappable network servers are also provided. For example, one embodiment comprises a method of electrically coupling a central processing unit of a network server to a plurality of network interface modules comprising the acts of routing an I/O bus having a first format from the central processing unit to primary sides of a plurality of bus adaptor chips and routing an I/O bus of the same first format from a secondary side of the bus adaptor chips to respective ones of the network interface modules.
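The routing described above can be sketched in code. The following is a minimal, hypothetical model, not anything defined by the specification: it simply records that a bus of one format (PCI is assumed here) runs from the CPU to the primary side of each bus adapter chip, and that a bus of the same format runs from each chip's secondary side to its network interface module. All names and the dictionary layout are invented for illustration.

```python
# Hypothetical sketch of the claimed bus routing: the same I/O bus format
# appears on both the primary (CPU-facing) and secondary (module-facing)
# sides of every bus adapter chip. Names are illustrative only.
def route_buses(num_modules, bus_format="PCI"):
    links = []
    for i in range(num_modules):
        links.append({
            # Primary side: from the central processing unit to the adapter.
            "primary": {"from": "CPU", "to": f"adapter-{i}", "format": bus_format},
            # Secondary side: from the adapter to its network interface module,
            # in the SAME first format as the primary side.
            "secondary": {"from": f"adapter-{i}", "to": f"module-{i}", "format": bus_format},
        })
    return links
```

The point the sketch makes is the format symmetry: because both sides of the adapter carry the same bus format, the adapter can act as a transparent bridge rather than a protocol converter.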
Embodiments of the present invention will now be described with reference to the accompanying Figures, wherein like numerals refer to like elements throughout. The terminology used in the description presented herein is intended to be interpreted in its broadest reasonable manner, even though it is being utilized in conjunction with a detailed description of certain specific embodiments of the present invention. This is further emphasized below with respect to some particular terms used herein. Any terminology intended to be interpreted by the reader in any restricted manner will be overtly and specifically defined as such in this specification.
In the server of
In advantageous embodiments described in detail with reference to
Referring now to
An ISA Bridge 218 is connected to the bus system 212 to support legacy devices such as a keyboard, one or more floppy disk drives and a mouse. A network of microcontrollers 225 is also interfaced to the ISA bus 226 to monitor and diagnose the environmental health of the fault tolerant system.
The two PC buses 214 and 216 contain bridges 242, 244, 246 and 248 to PC bus systems 250, 252, 254, and 256. As with the PC buses 214 and 216, the PC buses 250, 252, 254 and 256 can be designed according to any type of bus architecture, including PCI, ISA, EISA, and Microchannel. The PC buses 250, 252, 254 and 256 are connected, respectively, to canisters 258, 260, 262 and 264. These canisters are casings for a detachable bus system and provide multiple slots for adapters. In the illustrated canister, there are four adapter slots. The mechanical design of the canisters is described in more detail below in conjunction with
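The hierarchy just described can be modeled as a small tree. This is an illustrative sketch only; the reference numerals follow the text, but the dictionary layout is invented, and the assignment of two bridges to each PC bus is an assumption, since the text does not state which bridge hangs off which bus.

```python
# Hypothetical model of the bus hierarchy: two PC buses (214, 216), each
# assumed to carry two bridges (242-248), each bridge leading to a PC bus
# (250-256) that terminates in a canister (258-264) with four adapter slots.
def build_bus_tree():
    tree = {}
    pc_buses = (214, 216)
    bridges = (242, 244, 246, 248)
    sub_buses = (250, 252, 254, 256)
    canisters = (258, 260, 262, 264)
    for idx, (bridge, bus, can) in enumerate(zip(bridges, sub_buses, canisters)):
        parent = pc_buses[idx // 2]  # assumed pairing: two bridges per PC bus
        tree.setdefault(parent, []).append({
            "bridge": bridge,
            "bus": bus,
            "canister": {"id": can, "slots": [None] * 4},  # four adapter slots
        })
    return tree
```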
The physical arrangement of the components of the fault tolerant computer shown in
A central processing unit (CPU) module 103 which may advantageously include the system board 182 of
In this embodiment, the CPU module 103 is removably mounted on the top chassis shelf 175A. The next chassis shelf 175B below holds two removably mounted network interface modules 104 and one removably mounted power module 105. The remaining chassis shelf 175C also holds two removably mounted network interface modules 104 and one removably mounted power module 105. The network interface modules 104 and the power modules 105 are guided into place with the assistance of guide rails such as guide rail 180.
In one embodiment of the invention, the network interface modules 104 and the power modules 105 are connected to the CPU module 103 through an interconnection assembly module 209 (illustrated in additional detail in
Thus, with the interconnection assembly module 209 mounted on the chassis 170, the network interface modules 104 can be brought in and out of connection with the network server 100 by engaging and disengaging the network interface module 104 to and from its associated backplane board connector. One embodiment of these connectors is described in additional detail with reference to
In
In this Figure, the front of the interconnection assembly module 209 mounted on the rear of the chassis is partially in view.
In addition, one of the high density connectors 413 which interconnects the backplane printed circuit board 184 with one of the network interface modules 104 is shown in
As is also shown in
In one embodiment of the present invention, the I/O buses 341, 344, 349, and 350 are isolated by bus adapter chips 331, 332, 333 and 334. These bus adapter chips 331, 332, 333, and 334 provide, among other services, arbitrated access and speed matching along the I/O bus. One possible embodiment uses the DEC 21152 bridge chip as the bus adapter 331, 332, 333 or 334.
Several advantages of the present invention are provided by the bus adapter chips 331 through 334, as they may be configured to provide electrical termination and isolation when the corresponding network interface module 104 has been removed from its shelf on the chassis. Thus, in this embodiment, the bridge 331, 332, 333 or 334 acts as a terminator so that the electrical removal and insertion of a network interface module 104 at its shelf of the chassis 170 causes no electrical disruption on the primary side of the bridge chip 331, 332, 333 or 334. It is the primary side of the bridge chip 331B, 332B, 333B, or 334B which ultimately leads to the CPU module 103. Thus, the bridge chip 331, 332, 333 or 334 provides isolation for upstream electrical circuitry on the backplane printed circuit board 184 and ultimately for the CPU module 103 through an arbitration and I/O controller chip 351 or 352. As mentioned above, this embodiment uses a PCI bus for the I/O bus; in such an instance, the bridge chip is a PCI to PCI bridge. The arbitration and I/O controller chip 351 or 352 (not illustrated in
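The isolation property described above can be sketched as follows. This is a hedged behavioral model, not an electrical simulation and not an API from the specification: the class and method names are invented to show that upstream (primary-side) traffic is unaffected when the secondary side is disconnected.

```python
# Illustrative model of the bridge-as-terminator behavior: removing the
# secondary (module-facing) side is absorbed at the bridge, and traffic on
# the primary (CPU-facing) side continues undisturbed. Names are invented.
class IsolatingBridge:
    def __init__(self):
        self.secondary_present = True

    def remove_secondary(self):
        # The bridge acts as an electrical termination, so downstream
        # removal does not propagate upstream.
        self.secondary_present = False

    def primary_transaction(self, data):
        # Upstream transactions toward the CPU module complete whether or
        # not a network interface module is connected on the secondary side.
        return ("ok", data)
```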
Interface cards may be slipped into or removed from the interface card slots 562 when the canister 560 is removed from its shelf 175B or 175C in the chassis 170. An interface card slot 562 may be empty or may be filled with a general interface card. The general interface card may be a network interface card (NIC) such as, but not limited to, an Ethernet card or other local area network (LAN) card, with a corresponding NIC cable connected to the NIC and routed from the server 100 to a LAN. The general interface card may be a small computer system interface (SCSI) controller card with a corresponding SCSI controller card cable. In this embodiment, the SCSI controller card is connected by its cable to a data storage module such as the hard disks 106 or another data storage device. Furthermore, the general interface card need not be a NIC or a SCSI controller card, but may be some other compatible controller card. The canister front 560A also has bay windows 564 through which the general interface card cable may attach to a general interface card. Unused bay windows may be closed off with bay window covers 565.
The network interface module 104 also has a novel cooling system. Each network interface module 104 extends beyond the chassis rear and, in this portion, may include a pair of separately removable fans 566A and 566B. The separately removable fans are positioned in series, with one fan 566B behind the other fan 566A. The pair of fans 566A and 566B runs at reduced power and reduced speed unless one of the fans 566A or 566B fails, in which case the remaining working fan 566B or 566A will run at increased power and increased speed to compensate for the failed fan. The placement of the fans 566A and 566B beyond the chassis rear makes them readily accessible from behind the rack 102. Accessibility is desirable since the fans 566A and 566B may be removed and replaced without powering down or removing the network interface module 104.
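The compensation policy for the fan pair can be sketched in a few lines. This is a hypothetical model: the `Fan` class, the speed values, and the update function are invented for illustration; the specification describes the behavior, not a software interface.

```python
# Hypothetical sketch of the paired-fan policy described above: both fans
# normally run at reduced power and speed, and if one fails, the survivor
# runs at increased power and speed to compensate. Speed values are assumed.
REDUCED_SPEED = 50   # percent of full power (assumed value)
FULL_SPEED = 100

class Fan:
    def __init__(self, name):
        self.name = name
        self.failed = False
        self.speed = REDUCED_SPEED

def update_fan_pair(fan_a, fan_b):
    """Apply the compensation policy to a pair of fans mounted in series."""
    for fan, other in ((fan_a, fan_b), (fan_b, fan_a)):
        if fan.failed:
            fan.speed = 0
        elif other.failed:
            fan.speed = FULL_SPEED     # compensate for the failed partner
        else:
            fan.speed = REDUCED_SPEED  # normal quiet operation
```

Running both fans below full speed in normal operation trades airflow headroom for longevity and noise, while keeping a reserve that a single surviving fan can draw on.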
To further assist with the cooling of the canister 560, the canister 560 has sets of perforations 567 arranged in a pattern that promotes airflow through the canister 560. In this embodiment, the perforations 567 are holes in the canister 560 placed in the pattern of roughly a rectangular region.
A significant advantage of this embodiment is the ability to change a general interface card in a network server 100 without powering down the network server 100 or the CPU module 103. To change a general interface card, it is desirable to first identify the bridge chip 331, 332, 333 or 334 whose secondary side is connected to the network interface module 104 containing the general interface card to be changed.
Assume that the general interface card to be changed is in the network interface module 104 which is connected by a PCI bus and high density connector to the bridge chip 331. To remove the network interface module 104 without disrupting operation of the other portions of the server 100, the bridge chip 331 may become an electrical termination, isolating the electrical hardware of the network server from the electrical removal or insertion occurring on the bridge chip secondary side 331A. This may be accomplished by having the CPU module 103 place the secondary side 331A, 332A, 333A or 334A of the bridge into a reset mode and having circuitry on the printed circuit board 561 of the network interface module 104 power down the canister 560, including the general interface cards within the canister 560. Once the canister 560 is powered down and the bridge chip has electrically isolated the network interface module from the rest of the electrical hardware in the network server 100, the network interface module 104 may be pulled out of its shelf 175B in the chassis 170. After the network interface module 104 has been removed, the general interface card can be removed from its interface card slot 562 and replaced. Subsequently, the network interface module 104 is removably mounted again on the shelf 175B in the chassis 170. The electrical hardware on the printed circuit board 561 of the network interface module 104 may then power up the canister 560, including the general interface cards within it. The bridge chip secondary side 331A, 332A, 333A or 334A is brought out of reset by the CPU module 103, and the network interface module 104 is again functional.
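The removal-and-replacement sequence above can be summarized as an ordered procedure. The sketch below is purely illustrative: the classes and the function are invented names for the roles the specification describes (the bridge chip, the canister, and the physical card swap), not a software interface disclosed by the patent.

```python
# Illustrative sketch of the hot-swap sequence: reset the bridge secondary
# side, power down and remove the canister, swap the card, then reverse the
# steps. Class and method names are invented for illustration.
class BridgeChip:
    def __init__(self):
        self.secondary_in_reset = False

class Canister:
    def __init__(self):
        self.powered = True
        self.inserted = True

def hot_swap_interface_card(bridge, canister, swap_card):
    # 1. The CPU module places the bridge secondary side into reset,
    #    so the bridge terminates and isolates the downstream bus.
    bridge.secondary_in_reset = True
    # 2. Circuitry on the module's printed circuit board powers down the
    #    canister, including the general interface cards inside it.
    canister.powered = False
    # 3. The network interface module is pulled out of its chassis shelf.
    canister.inserted = False
    # 4. The general interface card is swapped while the module is out.
    swap_card()
    # 5. The module is remounted and the canister is powered back up.
    canister.inserted = True
    canister.powered = True
    # 6. The CPU module brings the bridge secondary side out of reset.
    bridge.secondary_in_reset = False
```

Note that the server's CPU module and the other network interface modules are never touched by this sequence, which is the source of the hot-swap advantage described below.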
At no time during the procedure did the network server 100 or the CPU module 103 have to be powered down. Although the one network interface module 104 was powered down during the procedure, the other network interface modules were still functioning normally. In fact, any workstation connected to the network server 100 by means other than the affected network interface module 104 would still have total access to the CPU module 103, the other network interface modules, and all the networks and data storage modules, such as, but not limited to, hard disks, CD-ROM modules, or other data storage devices, that do not rely upon the general interface cards inside the removed network interface module. This is a desired advantage, since network server down time can be very costly to customers and vendors, can create poor customer opinion of the vendor and the vendor's products and services, and can decrease overall computing throughput.
The foregoing description details certain embodiments of the present invention and describes the best mode contemplated. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the invention can be practiced in many ways. As is also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the present invention should not be taken to imply that the broadest reasonable meaning of such terminology is not intended, or that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the invention with which that terminology is associated. The scope of the present invention should therefore be construed in accordance with the appended Claims and any equivalents thereof.
The present application is a continuation of U.S. patent application Ser. No. 10/808,220, filed Mar. 23, 2004, which is a continuation of U.S. patent application Ser. No. 10/016,296, filed Oct. 30, 2001, now U.S. Pat. No. 6,742,069, which is a continuation of U.S. patent application Ser. No. 08/943,044, filed on Oct. 1, 1997, now U.S. Pat. No. 6,324,608. Moreover, the benefit under 35 U.S.C. § 119(e) of the following U.S. provisional applications is hereby claimed:

Title | Appl. No. | Filing Date
---|---|---
"Hardware and Software Architecture for Inter-Connecting an Environmental Management System with a Remote Interface" | 60/047,016 | May 13, 1997
"Self Management Protocol for a Fly-By-Wire Service Processor" | 60/046,416 | May 13, 1997
"Isolated Interrupt Structure for Input/Output Architecture" | 60/047,003 | May 13, 1997
"Three Bus Server Architecture with a Legacy PCI Bus and Mirrored I/O PCI Buses" | 60/046,490 | May 13, 1997
"Computer System Hardware Infrastructure for Hot Plugging Single and Multi-Function PC Cards Without Embedded Bridges" | 60/046,398 | May 13, 1997
"Computer System Hardware Infrastructure for Hot Plugging Multi-Function PCI Cards With Embedded Bridges" | 60/046,312 | May 13, 1997

The subject matter of U.S. Pat. No. 6,175,490 entitled "FAULT TOLERANT COMPUTER SYSTEM", issued on Jan. 16, 2001, is related to this application. The following patent applications, commonly owned and filed Oct. 1, 1997, are hereby incorporated herein in their entirety by reference thereto:

Title | Application No. | Patent No. | Attorney Docket No.
---|---|---|---
"System Architecture for Remote Access and Control of Environmental Management" | 08/942,160 | 6,266,721 | MTIPAT.114A
"Method of Remote Access and Control of Environmental Management" | 08/942,215 | 6,189,109 | MTIPAT.115A
"System for Independent Powering of Diagnostic Processes on a Computer System" | 08/942,410 | 6,202,160 | MTIPAT.116A
"Method of Independent Powering of Diagnostic Processes on a Computer System" | 08/942,320 | 6,134,668 | MTIPAT.117A
"Diagnostic and Managing Distributed Processor System" | 08/942,402 | 6,338,150 | MTIPAT.118A
"Method for Managing a Distributed Processor System" | 08/942,448 | 6,249,885 | MTIPAT.119A
"System for Mapping Environmental Resources to Memory for Program Access" | 08/942,222 | 6,122,758 | MTIPAT.120A
"Method for Mapping Environmental Resources to Memory for Program Access" | 08/942,214 | 6,199,173 | MTIPAT.121A
"Hot Add of Devices Software Architecture" | 08/942,309 | 6,499,073 | MTIPAT.122A
"Method for The Hot Add of Devices" | 08/942,306 | 6,247,080 | MTIPAT.126A
"Hot Swap of Devices Software Architecture" | 08/942,311 | 6,192,434 | MTIPAT.130A
"Method for The Hot Swap of Devices" | 08/942,457 | 6,304,929 | MTIPAT.123A
"Method for the Hot Add of a Network Adapter on a System Including a Dynamically Loaded Adapter Driver" | 08/943,072 | 5,892,928 | MTIPAT.127A
"Method for the Hot Add of a Mass Storage Adapter on a System Including a Statically Loaded Adapter Driver" | 08/942,069 | 6,219,734 | MTIPAT.131A
"Method for the Hot Add of a Network Adapter on a System Including a Statically Loaded Adapter Driver" | 08/942,465 | 6,202,111 | MTIPAT.124A
"Method for the Hot Add of a Mass Storage Adapter on a System Including a Dynamically Loaded Adapter Driver" | 08/962,963 | 6,179,486 | MTIPAT.125A
"Method for the Hot Swap of a Network Adapter on a System Including a Dynamically Loaded Adapter Driver" | 08/943,078 | 5,889,965 | MTIPAT.128A
"Method for the Hot Swap of a Mass Storage Adapter on a System Including a Statically Loaded Adapter Driver" | 08/942,336 | 6,249,828 | MTIPAT.129A
"Method for the Hot Swap of a Network Adapter on a System Including a Statically Loaded Adapter Driver" | 08/942,459 | 6,170,028 | MTIPAT.132A
"Method for the Hot Swap of a Mass Storage Adapter on a System Including a Dynamically Loaded Adapter Driver" | 08/942,458 | 6,173,346 | MTIPAT.133A
"Method of Performing an Extensive Diagnostic Test in Conjunction with a BIOS Test Routine" | 08/942,463 | 6,035,420 | MTIPAT.155A
"Apparatus for Performing an Extensive Diagnostic Test in Conjunction with a BIOS Test Routine" | 08/942,163 | 6,009,541 | MTIPAT.156A
"Configuration Management Method for Hot Adding and Hot Replacing Devices" | 08/941,268 | 6,148,355 | MTIPAT.134A
"Configuration Management System for Hot Adding and Hot Replacing Devices" | 08/942,408 | 6,243,773 | MTIPAT.135A
"Apparatus for Interfacing Buses" | 08/942,382 | 6,182,180 | MTIPAT.136A
"Method for Interfacing Buses" | 08/942,413 | 5,987,554 | MTIPAT.137A
"Computer Fan Speed Control Device" | 08/942,447 | 5,990,582 | MTIPAT.091A
"Computer Fan Speed Control Method" | 08/942,216 | 5,962,933 | MTIPAT.092A
"System for Powering Up and Powering Down a Server" | 08/943,076 | 6,122,746 | MTIPAT.089A
"Method of Powering Up and Powering Down a Server" | 08/943,077 | 6,163,849 | MTIPAT.090A
"System for Resetting a Server" | 08/942,333 | 6,065,053 | MTIPAT.095A
"Method of Resetting a Server" | 08/942,405 | | MTIPAT.096A
"System for Displaying Flight Recorder" | 08/942,070 | 6,138,250 | MTIPAT.097A
"Method of Displaying Flight Recorder" | 08/942,068 | 6,073,255 | MTIPAT.098A
"Synchronous Communication Interface" | 08/943,355 | 6,219,711 | MTIPAT.099A
"Synchronous Communication Emulation" | 08/942,004 | 6,068,661 | MTIPAT.100A
"Software System Facilitating the Replacement or Insertion of Devices in a Computer System" | 08/942,317 | 6,134,615 | MTIPAT.101A
"Method for Facilitating the Replacement or Insertion of Devices in a Computer System" | 08/942,316 | 6,134,614 | MTIPAT.102A
"System Management Graphical User Interface" | 08/943,357 | | MNFRAME.028A
"Display of System Information" | 08/942,195 | 6,046,742 | MTIPAT.103A
"Data Management System Supporting Hot Plug Operations on a Computer" | 08/942,129 | 6,105,089 | MTIPAT.138A
"Data Management Method Supporting Hot Plug Operations on a Computer" | 08/942,124 | 6,058,445 | MTIPAT.139A
"Alert Configurator and Manager" | 08/942,005 | 6,425,000 | MTIPAT.140A
"Managing Computer System Alerts" | 08/943,356 | 6,553,416 | MTIPAT.141A
"Computer Fan Speed Control System" | 08/940,301 | 6,247,898 | MTIPAT.093A
"Computer Fan Speed Control System Method" | 08/941,267 | 6,526,333 | MTIPAT.094A
"Black Box Recorder for Information System Events" | 08/942,381 | 6,269,412 | MTIPAT.104A
"Method of Recording Information System Events" | 08/942,164 | 6,282,673 | MTIPAT.105A
"Method for Automatically Reporting a System Failure in a Server" | 08/942,168 | 6,243,838 | MTIPAT.106A
"System for Automatically Reporting a System Failure in a Server" | 08/942,384 | 6,170,067 | MTIPAT.107A
"Expansion of PCI Bus Loading Capacity" | 08/942,404 | 6,249,834 | MTIPAT.108A
"Method for Expanding PCI Bus Loading Capacity" | 08/942,223 | 6,195,717 | MTIPAT.109A
"System for Displaying System Status" | 08/942,347 | 6,145,098 | MTIPAT.142A
"Method of Displaying System Status" | 08/942,071 | 6,088,816 | MTIPAT.143A
"Fault Tolerant Computer System" | 08/942,194 | 6,175,490 | MTIPAT.144A
"Method for Hot Swapping of Network Components" | 08/943,044 | 6,324,608 | MTIPAT.145A
"A Method for Communicating a Software Generated Pulse Waveform Between Two Servers in a Network" | 08/942,221 | 6,163,853 | MTIPAT.146A
"A System for Communicating a Software Generated Pulse Waveform Between Two Servers in a Network" | 08/942,409 | 6,272,648 | MTIPAT.147A
"Method for Clustering Software Applications" | 08/942,318 | 6,134,673 | MTIPAT.149A
"System for Clustering Software Applications" | 08/942,411 | 6,363,497 | MTIPAT.148A
"Method for Automatically Configuring a Server after Hot Add of a Device" | 08/942,319 | 6,212,585 | MTIPAT.150A
"System for Automatically Configuring a Server after Hot Add of a Device" | 08/942,331 | 6,263,387 | MTIPAT.151A
"Method of Automatically Configuring and Formatting a Computer System and Installing Software" | 08/942,412 | 6,154,835 | MTIPAT.152A
"System for Automatically Configuring and Formatting a Computer System and Installing Software" | 08/941,955 | 6,138,179 | MTIPAT.153A
"Determining Slot Numbers in a Computer" | 08/942,462 | 6,269,417 | MTIPAT.154A
"System for Detecting Errors in a Network" | 08/942,169 | | MNFRAME.058A
"Method of Detecting Errors in a Network" | 08/940,302 | | MNFRAME.059A
"System for Detecting Network Errors" | 08/942,407 | | MNFRAME.060A
"Method of Detecting Network Errors" | 08/942,573 | | MNFRAME.061A
Number | Date | Country | |
---|---|---|---|
60047016 | May 1997 | US | |
60046416 | May 1997 | US | |
60047003 | May 1997 | US | |
60046490 | May 1997 | US | |
60046398 | May 1997 | US | |
60046312 | May 1997 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 10808220 | Mar 2004 | US |
Child | 11417943 | May 2006 | US |
Parent | 10016296 | Oct 2001 | US |
Child | 10808220 | Mar 2004 | US |
Parent | 08943044 | Oct 1997 | US |
Child | 10016296 | Oct 2001 | US |