Standby SBC backplate

Information

  • Patent Grant
  • 6510529
  • Patent Number
    6,510,529
  • Date Filed
    Wednesday, September 15, 1999
  • Date Issued
    Tuesday, January 21, 2003
Abstract
A computer system employs a first computer; a first bus switch coupled to the first computer; a data bus coupled to the first computer via the first bus switch; a second computer; a second bus switch coupled to the second computer, the data bus being coupled to the second computer through the second bus switch; and a monitor system coupled to the first computer, to the first bus switch, and to the second bus switch. The monitor system employs a watchdog timer coupled to a switch over circuit, wherein a watchdog timeout period exceeds a period between executions of a reset code, the reset code being included in software executing on the first computer, wherein a reset signal is generated in response to execution of the reset code, thereby resetting the watchdog timer prior to the watchdog timeout period, and wherein upon a failure in the first computer the reset code is not executed, and therefore the reset signal is not generated, thereby not resetting the watchdog timer prior to the watchdog timeout period, wherein the watchdog timer generates a switch over signal in the event the watchdog timeout period is reached before the watchdog timer is reset, wherein the switch over circuit opens the first data bus switch and closes the second data bus switch in response to the switch over signal.
Description




BACKGROUND OF THE INVENTION




The present invention relates to backup hardware in electronic computer systems, and, in particular, to standby single board computers (SBC's). Even more particularly, the present invention relates to a standby single board computer backplane system and method.




During the past decade, the personal computer industry has literally exploded into the culture and business of many industrialized nations. Personal computers, while first designed for applications of limited scope involving individuals sitting at terminals, producing work products such as documents, databases, and spread sheets, have matured into highly sophisticated and complicated tools. What was once a business machine reserved for home and office applications, has now found numerous deployments in complicated industrial control systems, communications, data gathering, and other industrial and scientific venues. As the power of personal computers has increased by orders of magnitude every year since the introduction of the personal computer, personal computers have been found performing tasks once reserved to mini-computers, mainframes and even supercomputers.




In many of these applications, personal computers perform mission critical tasks involving significant stakes and low tolerance for failure. In these environments, even a single short-lived failure of a personal computer can represent a significant financial event for its owner.




Industrial personal computers are used in critical applications that require much higher levels of reliability than provided by most personal computers. They are used for telephony applications, such as controlling a company's voice mail or e-mail systems. They may be used to control critical machines, such as check sorting, or mail sorting for the U.S. Postal Service. Computer failures in these applications can result in significant loss of revenue or loss of critical information. For this reason, companies seek to purchase industrial personal computers, specifically looking for features that increase reliability, such as better cooling, redundant, hot-swappable power supplies or redundant disk arrays. These features have provided relief for some failures, but these systems are still vulnerable to failures of the single board computer (SBC) within the industrial personal computer system itself. If the processor, memory or support circuitry on a single board computer fails, or software fails, the single board computer can hang or behave in such a way that the entire industrial personal computer system fails. Some industry standards heretofore dictated that the solution to this problem is to maintain two completely separate industrial personal computer systems, including redundant single board computers and interface cards. In many cases, these interface cards are very expensive, perhaps as much as ten times the cost of the single board computer.




As a result, various mechanisms for creating redundancy within and between personal computers have been attempted in an effort to provide backup hardware that can take over in the event of a failure.




One approach, mentioned above, to providing backup hardware, referred to herein as complete redundancy, involves maintaining a duplicate (or backup) personal computer and duplicate attendant interface devices, storage devices, chassis and power supplies on hand to either manually or automatically switch control in the event that a primary personal computer fails in one way or another. Unfortunately, this level of redundancy requires that all components of the primary personal computer be duplicated in the backup personal computer. While this provides arguably a maximum degree of redundancy and thus security, it requires that in many instances very expensive or non-critical hardware be duplicated.




For example, in many industrial applications, highly specialized interface boards are used to interface systems with the personal computer. These systems may involve telephony, such as cellular telephony, voice mail data acquisition, monitoring, control, and other such applications. In the event that one of these interface boards were to fail, generally, the remaining operations performed by the personal computer can continue to perform. For example, in the case of a cellular telephone system, the loss of a single interface board may mean that one “line” is out of service, but remaining “lines” remain in service. This level of failure is hardly noticeable by customers of the cellular telephony system, and thus is generally considered tolerable. On the other hand, however, these interface boards are extremely expensive and highly specialized. Thus, maintaining redundancy of these boards is both undesirable and unnecessary.




Unfortunately, prior approaches, including complete redundancy, fail to address this real world fact adequately.




For example, in U.S. Pat. No. 5,185,693, Loftis, et al., teach a backup mode of operation in which a primary personal computer can be replaced by a backup personal computer in the event a failure is detected. Failure is detected through a local area network that couples the primary personal computer to the secondary personal computer. The primary and secondary personal computers are coupled through a complicated bus switch that routes either a bus from the primary personal computer or a bus from the secondary personal computer to a plurality of remotely located (field) input/output units. The input/output units are further coupled to process instrumentation for monitoring and/or controlling an ongoing process, such as a manufacturing process.




In operation, the backup personal computer monitors the status of the primary personal computer through the local area network. Through the local area network, active data in the secondary personal computer is constantly updated with current information concerning process monitoring and control. This local area network connection may further be used to monitor the status of the primary personal computer using the secondary personal computer by, for example, deploying a watchdog timer to detect loss of bus activity. Alternatively, a separate digital output device, coupled to a terminal end of the input/output bus, may use a watchdog timer to monitor the bus for a lack of bus activity and to effect the switch over from the primary personal computer to the secondary personal computer in the event of such loss for more than a timeout period. In either case, in the event a loss of bus activity is detected, a switch switches from the primary personal computer to the secondary personal computer to gain control over the data bus leading to the remotely located input/output units.




Unfortunately, the switch employed in the illustrated device is highly complicated, and thus, is itself, sensitive to failures. In the event the switch does fail, switch over from the primary personal computer to the secondary personal computer cannot occur. Monitoring of the primary personal computer for failures is disadvantageously hindered by the fact that the secondary personal computer, in one embodiment, monitors the primary personal computer—and even then, monitoring is primitive, i.e., bus activity is monitored. Because of this, in the event that the secondary personal computer fails, the primary personal computer will no longer be monitored, and thus the switch over to the secondary personal computer will not occur. And, because no monitoring of the secondary personal computer is performed, this failure of the secondary personal computer will not be detected, thus meaning that the primary personal computer can go unmonitored and unbacked up for a significant period of time without detection. Similarly, in an alternative embodiment, the data output on the remote bus is used to monitor for bus activity, and effect switch over between the primary computer and the secondary computer in the event of the lack of bus activity. Unfortunately, bus activity can be generated by devices other than the primary and secondary personal computers, and thus may not be a good indicator of failure. And, with modern personal computers, a failure in one process on the primary personal computer may not result in a complete failure of the personal computer. Thus, a process can remain locked up while bus activity continues (as a result of activities of other processes on the primary personal computer or remote input/output units), and thus the failure goes undetected. As a result, bus activity may continue despite a catastrophic failure of the primary personal computer.




Furthermore, the approach offered by Loftis, et al., fails to address the principal issue outlined above, namely, providing a backup of the primary personal computer using the secondary personal computer while at the same time utilizing a common set of interface cards. Unlike the input/output units shown by Loftis, et al., interface cards are internal to the system of the personal computer, generally housed within a single housing therewith. The external approach offered by Loftis, et al., thus would not offer a solution to the needs of modern industrial computer users.




Other examples of backup systems are shown in U.S. Pat. No. 5,434,998 (Akai, et al.), U.S. Pat. No. 5,583,987 (Kobayashi, et al.), and U.S. Pat. No. 5,729,675 (Miller, et al.).




The present invention addresses the above and other needs.




SUMMARY OF THE INVENTION




The present invention advantageously addresses the needs above as well as other needs by providing a standby computer backplane system and method.




In one embodiment, the invention can be characterized as a computer system employing a first computer; a first bus switch coupled to the first computer; a data bus coupled to the first computer via the first bus switch; a second computer; a second bus switch coupled to the second computer, the data bus being coupled to the second computer through the second bus switch; and a monitor system coupled to the first computer, to the first bus switch, and to the second bus switch. The monitor system employs a watchdog timer coupled to a switch over circuit, wherein a watchdog timeout period exceeds a period between executions of a reset code, the reset code being included in software executing on the first computer, wherein a reset signal is generated in response to execution of the reset code, thereby resetting the watchdog timer prior to the watchdog timeout period, and wherein upon a failure in the first computer the reset code is not executed, and therefore the reset signal is not generated, thereby not resetting the watchdog timer prior to the watchdog timeout period, wherein the watchdog timer generates a switch over signal in the event the watchdog timeout period is reached before the watchdog timer is reset, wherein the switch over circuit opens the first data bus switch and closes the second data bus switch in response to the switch over signal.
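The timing relationship recited in this embodiment can be illustrated with a small tick-based simulation. This is only a sketch under stated assumptions: the names WatchdogTimer, SwitchOverCircuit and simulate, the use of discrete ticks, and the specific periods are hypothetical and are not taken from the specification, which describes hardware rather than software.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class WatchdogTimer:
    """Hypothetical model: counts ticks since the last reset signal."""
    timeout_ticks: int
    elapsed: int = 0

    def reset(self) -> None:
        # Models the reset signal generated when the reset code executes.
        self.elapsed = 0

    def tick(self) -> bool:
        # Advance one tick; True means the timeout period has been reached.
        self.elapsed += 1
        return self.elapsed >= self.timeout_ticks


@dataclass
class SwitchOverCircuit:
    first_switch_closed: bool = True    # first computer coupled to the data bus
    second_switch_closed: bool = False  # second computer decoupled

    def switch_over(self) -> None:
        # Open the first bus switch and close the second bus switch.
        self.first_switch_closed = False
        self.second_switch_closed = True


def simulate(total_ticks: int, reset_period: int, timeout_ticks: int,
             fail_at: Optional[int]) -> Tuple[SwitchOverCircuit, Optional[int]]:
    """Run the first computer; it stops executing reset code from tick fail_at.

    Returns the circuit state and the tick of the switch over (None if none).
    """
    watchdog = WatchdogTimer(timeout_ticks)
    circuit = SwitchOverCircuit()
    for t in range(1, total_ticks + 1):
        if watchdog.tick():
            circuit.switch_over()
            return circuit, t
        healthy = fail_at is None or t < fail_at
        if healthy and t % reset_period == 0:
            watchdog.reset()
    return circuit, None
```

With a reset period of 3 ticks and a timeout of 5 ticks, a healthy first computer never trips the timer, whereas a failure at tick 10 leaves the last reset at tick 9 and the switch over occurs at tick 14. If the timeout did not exceed the reset period, the timer in this model would expire even during normal operation.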











BRIEF DESCRIPTION OF THE DRAWINGS




The above and other aspects, features and advantages of the present invention will be more apparent from the following more particular description thereof, presented in conjunction with the following drawings wherein:





FIG. 1 is a block diagram of an industrial personal computer system employing a standby single board computer backplane, in which a primary and a second single board computer are selectively coupled through first and second PCI bus switches, respectively, to a primary PCI bus, in accordance with one embodiment of the present invention;





FIG. 2 is a block diagram of another industrial computer system employing another standby single board computer backplane, in which a primary and a second single board computer are selectively coupled through first and second PCI bus switches, respectively, to a primary PCI bus and through first and second ISA bus switches, respectively, to an ISA bus, in accordance with one embodiment of the present invention;





FIG. 3 is a block diagram illustrating a plurality of watchdog timers in a monitor system, which are coupled through an ISA bus to the first single board computer of FIGS. 1 and 2, where corresponding reset code resets the watchdog timers before corresponding watchdog timeout periods in the event the first single board computer is functioning normally, and where one or more instances of the corresponding reset code do not reset the watchdog timers before the corresponding watchdog timeout periods in the event the first single board computer is not functioning normally;





FIGS. 4A, 4B, 4C, 4D, 4E, 4F, 4G, and 4H are a schematic diagram showing an exemplary implementation of the industrial personal computer system of FIG. 1; and





FIGS. 5A, 5B, 5C, 5D, 5E, 5F, 5G, 5H, and 5I are a schematic diagram showing an exemplary implementation of the industrial personal computer system of FIG. 2.











Corresponding reference characters indicate corresponding components throughout the several views of the drawings.




DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS




The following description of the presently contemplated best mode of practicing the invention is not to be taken in a limiting sense, but is made merely for the purpose of describing the general principles of the invention. The scope of the invention should be determined with reference to the claims.




Referring to FIG. 1, a block diagram is shown of an industrial personal computer system 100 consistent with the present invention and in accordance with one embodiment.




Shown is a first single board computer 102, or primary personal computer, coupled through a first PCI bus switch 104 to a primary PCI bus 106. The primary PCI bus 106 is coupled to each of three PCI/PCI bridges 108, 110, 112, each of which is coupled to five PCI card slots 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142 for supporting, in this embodiment, up to 15 different PCI based interface cards. These interface cards can take numerous forms, such as telecommunications control boards, voice mail control boards, data acquisition boards, process control boards, and the like. The PCI/PCI bridges 108, 110, 112 function in a conventional, well known manner to convey data between the first single board computer 102 and respective ones of the PCI based interface boards.




The first single board computer 102 is also coupled through a first IDE channel switch 144 to an IDE channel 146, which is in turn coupled to an IDE device 148, such as a CD ROM drive, or a hard drive. The first single board computer 102 is coupled through a first floppy disk channel switch 150 to a floppy disk channel 152 on which a floppy disk drive 154 resides. Finally, the first single board computer 102 is coupled through a power switch 156 to a power supply 158.




Aside from the above-identified switches, i.e., the first PCI bus switch 104, the first IDE channel switch 144, the first floppy disk drive channel switch 150, and the first power switch 156, the above configuration (as so far described) is typical of industrial personal computer systems employing a single board computer to supply processing and memory capabilities.




Unlike in typical industrial personal computer systems, however, with this embodiment, a monitor system 160 is coupled to the first single board computer 102 through an industry standard architecture (ISA) bus 162. Through the ISA bus 162, the monitor system 160 is able to reset one or more watchdog timers in response to signals from the first single board computer 102. Unlike in prior systems, these signals are generated by the first single board computer 102 in response to custom code within software operating on the first single board computer 102. The custom code may be for example in an operating system, driver, application program, or the like.




For example, within the software operating on the first single board computer, there may be custom code programmed to periodically cause the generation of the signals, during normal operation. In this case, in the event that the signals are at some point not generated, such would be an indication that a particular portion of the software in which the custom code is located is not operating normally on the first single board computer 102.




Within the monitor system 160, the watchdog timers are configured to cause a fault condition when they are not reset after a predetermined period of time. Thus, if one or more of the signals are not generated, because there is a fault in one or more particular portions of the software, the watchdog timers corresponding to those particular portions of the software will fail to be reset and, after the predetermined period of time, will signal a fault. In response to this, the monitor system 160 can, for example, signal an operator that a fault has occurred, such as by illuminating a light on a front panel on a housing of the computer system. In response to observing the light, the operator can then effect a manual switch over from the first single board computer 102 to the second single board computer 164 at a convenient time. (Manual switch over can be effected, for example, by operating a switch on the front panel of the housing. When manual switch over is effected, the monitor system 160 is signaled to perform the switch over in the manner described below in reference to an automated switch over alternative.)
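The per-portion monitoring described above, in which each watchdog timer corresponds to a particular portion of the software, might be modeled roughly as follows. The class name MonitorSystem, the portion labels, the timeout values and the use of explicit timestamps are all assumptions made for illustration; the specification does not give a software model of the monitor system 160.

```python
from typing import Dict, List


class MonitorSystem:
    """Hypothetical model: one watchdog timer per monitored software portion."""

    def __init__(self, timeouts: Dict[str, float]) -> None:
        self.timeouts = dict(timeouts)                        # seconds per portion
        self.last_reset = {name: 0.0 for name in timeouts}    # time of last reset
        self.fault_light_on = False                           # front panel indicator

    def reset(self, portion: str, now: float) -> None:
        # Called when the custom code in that portion generates its signal.
        self.last_reset[portion] = now

    def poll(self, now: float) -> List[str]:
        # Return the portions whose timers reached their timeout without a reset.
        faulted = sorted(
            name for name, timeout in self.timeouts.items()
            if now - self.last_reset[name] >= timeout
        )
        if faulted:
            self.fault_light_on = True  # signal the operator
        return faulted
```

In this sketch a portion that stops resetting its timer is reported by name while other portions continue to look healthy, which mirrors the point that a partial software failure can be detected even though the rest of the software keeps running.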




Alternatively, the monitor system 160 can be configured to automatically decouple the first single board computer 102 from the primary PCI bus 106, the IDE channel 146, the floppy disk drive channel 152, and the power supply 158, by opening the switches 104, 144, 150, 156. In this case, a second single board computer 164 is coupled through a second bus switch 166 to the primary PCI bus 106; is coupled to the IDE channel 146 through the second IDE channel switch 168; is coupled to the floppy drive channel 152 through a second floppy drive channel switch 170; and is coupled to the power supply 158 through a second power switch 172.




Thus, the monitor system 160 is able to simultaneously decouple the first single board computer 102 from the primary PCI bus 106, the IDE channel 146, the floppy disk drive channel 152 and the power supply 158, while coupling the second single board computer 164 to the primary PCI bus 106, the IDE channel 146, the floppy disk drive channel 152, and the power supply 158. As a result, the first single board computer 102 will, in effect, disappear, while simultaneously the second single board computer 164 will appear, as far as the PCI based interface cards, the IDE device 148, and the floppy disk drive 154 are concerned. In response to the application of power to the second single board computer 164, the second single board computer 164 will begin to boot up (i.e., perform bootstrap operations), and thus will initialize the PCI based interface cards and load software from the IDE device 148, such as a CD ROM device, or the floppy disk drive 154 (from a floppy disk). As a result, within moments of a failure of the first single board computer 102 being detected, the second single board computer 164 begins to boot, and will, shortly thereafter, generally on the order of a minute or two, resume operation in place of the first single board computer 102.
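The switch over across the four pairs of switches can be pictured with the following sketch. The names Backplane, RESOURCES, switch_over and owner are hypothetical. The text describes the decoupling and coupling as simultaneous; the sketch opens one set of switches and then closes the other only because sequential code has to pick some order, and that ordering is an assumption rather than something the specification states.

```python
from dataclasses import dataclass, field
from typing import Dict

# Shared resources reached through one switch per single board computer.
RESOURCES = ("pci_bus", "ide_channel", "floppy_channel", "power_supply")


def _initial_switches() -> Dict[str, Dict[str, bool]]:
    # True means the switch is closed (computer coupled to the resource).
    return {
        "first": {r: True for r in RESOURCES},
        "second": {r: False for r in RESOURCES},
    }


@dataclass
class Backplane:
    closed: Dict[str, Dict[str, bool]] = field(default_factory=_initial_switches)

    def switch_over(self, from_sbc: str, to_sbc: str) -> None:
        # Decouple the failed computer from every shared resource...
        for resource in RESOURCES:
            self.closed[from_sbc][resource] = False
        # ...then couple the standby computer, which begins to boot once powered.
        for resource in RESOURCES:
            self.closed[to_sbc][resource] = True

    def owner(self, resource: str) -> str:
        # Which computer the resource currently "sees".
        owners = [sbc for sbc, sw in self.closed.items() if sw[resource]]
        if len(owners) > 1:
            raise RuntimeError("two computers coupled to " + resource)
        return owners[0] if owners else "none"
```

From the point of view of each shared resource, the first computer is simply replaced by the second, which is the "disappear" and "appear" behavior described above.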




Note that the first IDE channel switch 144 and the second IDE channel switch 168 may together form a priority IDE channel switch. In this case, both the first single board computer 102 and the second single board computer 164 remain coupled to the IDE channel 146 at all times, with either the first single board computer 102 or the second single board computer 164 having priority over the other for access to the IDE channel 146. Priority may be either electronically or manually switchable or may be assigned to either the first single board computer 102 or the second single board computer 164 permanently. Similarly, the first floppy disk drive channel switch 150 and the second floppy disk drive channel switch 170 may together form a priority floppy disk drive channel switch, maintaining both the first single board computer 102 and the second single board computer 164 coupled to the floppy disk drive channel 152, with either the first single board computer 102 or the second single board computer 164 having priority, as determined either electronically, manually, or permanently.
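One possible reading of such a priority switch is a simple arbiter in which both computers stay coupled and the one holding priority wins whenever both want the channel. The function name grant_channel and this arbitration rule are assumptions for illustration; the text does not specify how the priority switch resolves contention.

```python
from typing import Optional, Set


def grant_channel(requesting: Set[str], priority: str) -> Optional[str]:
    """Return which computer is granted the shared channel for this access."""
    if not requesting:
        return None          # nobody is asking for the channel
    if priority in requesting:
        return priority      # the priority holder always wins
    # Only the non-priority computer is asking, so it may use the channel.
    return sorted(requesting)[0]
```

Switching priority electronically or manually would then correspond to changing the priority argument, and a permanent assignment to fixing it.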




Monitoring of the second single board computer 164 is performed in a manner analogous to that described above for monitoring the first single board computer 102, except that the second single board computer 164 is coupled to and communicates with the monitor system 160 via a serial port 174 as opposed to the ISA bus 162. Advantageously, the custom code in the software generates the signals on both the ISA bus 162 and the serial port 174 simultaneously, so identical software can be executed by the first single board computer 102 and the second single board computer 164, with the unused signals, i.e., the signals generated on the second single board computer's ISA bus, and the signals generated on the first single board computer's serial port, being ignored.
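The idea that identical software emits its signal on both paths, with only one path per computer being observed, can be sketched as below. The names LISTEN_CHANNEL, emit_reset and accepted, and the representation of a signal as a (computer, channel) pair, are hypothetical.

```python
from typing import Dict, List, Tuple

Signal = Tuple[str, str]  # (computer, channel)

# Which channel the monitor system is actually connected to for each computer.
LISTEN_CHANNEL: Dict[str, str] = {"first": "isa", "second": "serial"}


def emit_reset(sbc: str) -> List[Signal]:
    # The same custom code runs on either computer and signals on both paths.
    return [(sbc, "isa"), (sbc, "serial")]


def accepted(signals: List[Signal]) -> List[Signal]:
    # Only the signal arriving on the connected path reaches the monitor;
    # the other one is simply ignored.
    return [s for s in signals if LISTEN_CHANNEL[s[0]] == s[1]]
```

Because the software does not need to know which computer it is running on, one software image can serve both boards.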




Advantageously, the same PCI interface cards are used through the same extremely high speed PCI bus, regardless of whether the first single board computer or the second single board computer is active. Similarly, the same IDE device 148, e.g., a CD ROM drive or a hard drive, is employed, and thus data recorded during operation of the industrial personal computer system 100 is maintained; and the same floppy disk drive 154 is used so that, for example, a single boot disk can be employed.




This is particularly advantageous because the PCI based interface cards 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142 used in the PCI bus slots can be highly specialized and extremely expensive devices, while at the same time, shutdown of the entire industrial personal computer system 100 can be catastrophic.




Because failure of a single PCI based interface card is generally not catastrophic, these PCI based interface cards 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142 need not, in accordance with the present embodiment, be maintained redundantly. At the same time, however, redundancy can be maintained on such critical components as the first single board computer 102 so that significant downtime does not occur upon a failure. Further advantageously, the monitor system 160 operates completely independently of the first single board computer 102 and the second single board computer 164. Thus, the second single board computer 164, for example, can be maintained in a completely powered down, and, therefore, relatively safer condition, while the first single board computer 102 is actively monitored. Furthermore, the monitor system 160 can, by design, be substantially independent in functioning from the first single board computer, with the exception of receiving signals generated by particular portions of the software running on the first single board computer 102, and in response to which the monitor system 160 resets the watchdog timers. As a result, software failures (even partial software failures involving only one particular portion of the software) and/or hardware failures on the first single board computer 102 do not adversely affect the ability of the monitor system 160 to perform its critical function.




Finally, advantageously, simple Field Effect Transistor (FET) switches are employed as the first PCI bus switch 104 and the second PCI bus switch 166, allowing extremely fast switch over between the first single board computer and the second single board computer, while at the same time maintaining a highly simple and effective mechanism for switching.




Since power is removed from the first single board computer 102 on the detection of a fault, maintenance personnel can be alerted and can replace the first single board computer 102 after a failure while the industrial personal computer system continues to run. In this case the computer system will continue to run using the second single board computer 164. Because the monitor system 160 is coupled to the second single board computer 164 through a serial port 174, the second single board computer 164 can continue to operate until another fault is signaled. In that case, the monitor system 160 can activate the first single board computer 102, and deactivate the second single board computer 164, allowing maintenance personnel to then replace the second single board computer 164.




In a variation, both single board computers can be provided with power at all times. Independent operation of the first power switch 156 or the second power switch 172 can allow replacement of the first or second single board computer 102 or 164, respectively. With both single board computers 102, 164 running, the second single board computer 164 can be communicating with the first single board computer via, for example, the serial port 174, so as to be up to date on critical application statuses. Switch over, in this case, simply involves disconnection of the first single board computer 102 from the primary PCI bus 106 using the first PCI bus switch 104, the IDE channel 146 using the first IDE channel switch 144, and the floppy drive channel 152 using the first floppy drive channel switch 150, and connection of the second single board computer 164 to the primary PCI bus 106 using the second PCI bus switch 166, the IDE channel 146 using the second IDE channel switch 168 and the floppy drive channel 152 using the second floppy drive channel switch 170. Switch over in this instance can be accomplished much more quickly because a re-boot is not required. However, this approach requires altering application software and perhaps operating systems software in a more significant way.




Referring to FIG. 2, a block diagram is shown of an industrial personal computer system 200 consistent with the present invention and in accordance with one embodiment.




Shown is a first single board computer 202, or primary personal computer, coupled through a first PCI bus switch 204 to a primary PCI bus 206. The primary PCI bus 206 is coupled to each of three PCI/PCI bridges 208, 210, 212, each of which is coupled to five PCI card slots 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242 for supporting, in this embodiment, up to 15 different PCI based interface cards. These interface cards can take numerous forms, such as telecommunications control boards, voice mail control boards, data acquisition boards, process control boards, and the like. The PCI/PCI bridges 208, 210, 212 function in a conventional, well known manner to convey data between the first single board computer 202 and respective ones of the PCI based interface boards.




Also shown is the first single board computer 202 coupled through a first ISA bus switch 274 to an ISA bus 275. The ISA bus is coupled to a number of ISA card slots 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 299 for supporting various ISA based interface cards. These interface cards can also take numerous forms, such as telecommunications control boards, voice mail control boards, data acquisition boards, process control boards, and the like.




The first single board computer 202 is also coupled through a first IDE channel switch 244 to an IDE channel 246, which is in turn coupled to an IDE device 248, such as a CD ROM drive, or a hard drive. The first single board computer 202 is coupled through a first floppy disk channel switch 250 to a floppy disk channel 252 on which a floppy disk drive 254 resides. Finally, the first single board computer 202 is coupled through a power switch 256 to a power supply 258.




Aside from the above-identified switches, i.e., the first PCI bus switch 204, the first ISA bus switch 274, the first IDE channel switch 244, the first floppy disk drive channel switch 250, and the first power switch 256, the above configuration (as so far described) is typical of industrial personal computer systems employing a single board computer to supply processing and memory capabilities.




Unlike in typical industrial personal computer systems, however, in this embodiment a monitor system 260 is coupled to the first single board computer 202 through an ISA bus 262. Through the ISA bus 262, the monitor system 260 is able to reset various watchdog timers in response to signals from the first single board computer 202. Unlike in prior systems, these signals are generated by the first single board computer 202 in response to custom code within software operating on the first single board computer 202. For example, the software may be programmed to periodically cause the generation of the signals during normal operation. In this case, if the signals are at some point not generated, this indicates that a particular portion of the software is not operating normally on the first single board computer 202. Within the monitor system 260, the watchdog timers are configured to cause a fault condition when they are not reset within a predetermined period of time. Thus, if one or more of the signals are not generated because there is a fault in one or more particular portions of the software, the watchdog timers corresponding to those portions of the software will fail to be reset and, after the predetermined period of time, will signal a fault. In response, the monitor system 260 can, for example, signal an operator that a fault has occurred, such as by illuminating a light on a front panel of the computer system.
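The watchdog behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `WatchdogTimer`, the function `simulate`, the millisecond tick values, and the polling step are all hypothetical and do not come from the patent, which describes hardware timers reset over the ISA bus.

```python
class WatchdogTimer:
    """Illustrative model of one watchdog timer (hypothetical, not the patented circuit)."""

    def __init__(self, timeout_ms, now_ms=0):
        self.timeout_ms = timeout_ms      # the "predetermined period of time"
        self.last_reset_ms = now_ms

    def reset(self, now_ms):
        # Models the reset signal caused by the custom code in the monitored software.
        self.last_reset_ms = now_ms

    def expired(self, now_ms):
        # A fault is signaled once the timer has gone a full timeout without a reset.
        return now_ms - self.last_reset_ms >= self.timeout_ms


def simulate(reset_times_ms, timeout_ms, end_ms, step_ms=500):
    """Return the time at which a fault is first signaled, or None if no fault occurs.

    reset_times_ms lists the instants at which the software generates a reset signal.
    """
    watchdog = WatchdogTimer(timeout_ms)
    resets = set(reset_times_ms)
    for now_ms in range(0, end_ms + 1, step_ms):
        if now_ms in resets:
            watchdog.reset(now_ms)
        elif watchdog.expired(now_ms):
            return now_ms
    return None


# Software that resets every 1000 ms and stops after t = 4000 ms (a simulated failure).
fault_at = simulate(range(0, 5000, 1000), timeout_ms=2500, end_ms=10000)
# Software that keeps resetting for the whole run.
no_fault = simulate(range(0, 10001, 1000), timeout_ms=2500, end_ms=10000)
```

In this model a fault is reported only after the software has been silent for the full timeout, which mirrors the text: missing reset signals, rather than any explicit error report, are what indicate the failure.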




Alternatively, the monitor system 260 can be configured to automatically decouple the first single board computer 202 from the primary PCI bus 206, the ISA bus 275, the IDE channel 246, the floppy disk drive channel 252, and the power supply 258, by opening the switches 204, 274, 244, 250, and 256. In this case, a second single board computer 264 is coupled through a second PCI bus switch 266 to the primary PCI bus 206; is coupled through a second ISA bus switch 276 to the ISA bus 275; is coupled to the IDE channel 246 through a second IDE channel switch 268; is coupled to the floppy drive channel 252 through a second floppy drive channel switch 270; and is coupled to the power supply 258 through a second power switch 272.
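The switch arrangement described above can be sketched as a simple state table. The reference numerals are taken from the text; the dictionary layout, the function names, and the convention that `True` means a closed (conducting) switch are assumptions made for illustration only.

```python
# Reference numerals from the description; the dict structure itself is hypothetical.
FIRST_SBC_SWITCHES = {"pci": 204, "isa": 274, "ide": 244, "floppy": 250, "power": 256}
SECOND_SBC_SWITCHES = {"pci": 266, "isa": 276, "ide": 268, "floppy": 270, "power": 272}


def initial_state():
    """First board coupled (switches closed), second board decoupled (switches open)."""
    state = {number: True for number in FIRST_SBC_SWITCHES.values()}
    state.update({number: False for number in SECOND_SBC_SWITCHES.values()})
    return state


def switch_over(state, from_switches, to_switches):
    """Open every switch of the failed board and close every switch of the standby board."""
    new_state = dict(state)
    for number in from_switches.values():
        new_state[number] = False
    for number in to_switches.values():
        new_state[number] = True
    return new_state


def active_board(state):
    """Report which board is coupled to the shared buses, or 'invalid' for a mixed state."""
    first = [state[n] for n in FIRST_SBC_SWITCHES.values()]
    second = [state[n] for n in SECOND_SBC_SWITCHES.values()]
    if all(first) and not any(second):
        return "first"
    if all(second) and not any(first):
        return "second"
    return "invalid"


before = initial_state()
after = switch_over(before, FIRST_SBC_SWITCHES, SECOND_SBC_SWITCHES)
```

Building the new state in one step, rather than toggling switches one at a time, reflects the description's point that decoupling one board and coupling the other happen together, so the shared buses never see both boards at once.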




Thus, as with the embodiment described with reference to FIG. 1, the monitor system 260 is able to simultaneously decouple the first single board computer 202 from the primary PCI bus 206, the IDE channel 246, the floppy disk drive channel 252, and the power supply 258, while coupling the second single board computer 264 to the primary PCI bus 206, the IDE channel 246, the floppy disk drive channel 252, and the power supply 258. In addition, the monitor system 260 is able to simultaneously decouple the first single board computer 202 from the ISA bus 275, while coupling the second single board computer 264 to the ISA bus 275. As a result, the first single board computer 202 will, in effect, disappear while simultaneously the second single board computer 264 will appear, as far as the PCI based interface cards, the ISA based interface cards, the IDE device 248, and the floppy disk drive 254 are concerned. As with the embodiment of FIG. 1, in response to the application of power to the second single board computer 264, the second single board computer 264 will begin to boot, and thus will initialize the PCI based interface cards and the ISA based interface cards, and load software from the IDE device 248, such as a CD ROM device, or from the floppy disk drive 254 (from a floppy disk). As a result, within moments of a failure of the first single board computer 202 being detected, the second single board computer 264 begins to boot, and will shortly thereafter, generally on the order of a minute or two, resume operation in place of the first single board computer 202. Monitoring of the second single board computer 264 is performed in a manner analogous to that described above for monitoring the first single board computer 202, except that the second single board computer 264 is coupled to and communicates with the monitor system 260 via a serial port 274 as opposed to the ISA bus 262.




Advantageously, the same PCI based interface cards and the same ISA based interface cards are used through the same PCI bus or ISA bus, respectively, regardless of whether the first single board computer or the second single board computer is active. Similarly, as with the embodiment of FIG. 1, the same IDE device 248, e.g., a CD ROM drive or a hard drive, is employed, and thus data recorded during operation of the industrial personal computer system 20 is maintained; and the same floppy disk drive 254 is used so that, for example, a single boot disk can be employed.




Thus, this embodiment offers all of the advantages of the embodiment of FIG. 1, while additionally providing for switch over from the first single board computer 202 to the second single board computer 264 on the ISA bus 275. As with the PCI based interface cards, the ISA based interface cards used in the ISA bus slots can be highly specialized and extremely expensive devices, while at the same time, shutdown of the entire industrial personal computer system 20 can be catastrophic.




As with the PCI based interface cards, the failure of a single ISA based interface card is generally not catastrophic.




Finally, simple Field Effect Transistor (FET) switches are also employed as the first ISA bus switch 274 and the second ISA bus switch 276, again allowing extremely fast switch over between the first single board computer and the second single board computer, while at the same time maintaining a simple and effective mechanism for switching.




In all other material respects the embodiment of FIG. 2 is identical to the embodiment of FIG. 1, and the variations of the embodiment of FIG. 1 are similarly applicable to the embodiment of FIG. 2. Thus, further detailed explanation is not repeated. Instead, the reader is directed to the description of FIG. 1 for further details and embodiments regarding the structure, operation, features and advantages of the present embodiment (the embodiment of FIG. 2).




Referring to FIG. 3, a block diagram is shown of the monitor system 360, the ISA bus 362, the first single board computer 302, the serial port 374, and the second single board computer 364. Also shown within the monitor system 360 are a plurality of watchdog timers 304, 306, 308, each coupled through the ISA bus 362 to respective custom code 310, 312, 314 within software within the first single board computer 302. Further shown within the second single board computer 364 is custom code 316, 318, 320 coupled through the serial port 374 to the watchdog timers 304, 306, 308. As described above, the watchdog timers 304, 306, 308 operate independently from one another, each being coupled to a switch over circuit 318. The switch over circuit 318 effects switch over from the first single board computer 302 to the second single board computer 364 (or vice versa) by operating the switches, as described above, e.g., by opening the first PCI bus switch, thereby disconnecting the first single board computer 302 from the primary PCI bus, and simultaneously closing the second PCI bus switch, thereby connecting the second single board computer 364 to the primary PCI bus (or vice versa, i.e., opening the second PCI bus switch and closing the first PCI bus switch).




As described above, the reset code 310, 312, 314 periodically executes as a part of normal operation of the software within the first single board computer 302 or the second single board computer 364. The periodicity of execution of the custom code 310, 312, 314 (or reset code) is used, on an individual basis, to determine a watchdog timeout period for each watchdog timer 304, 306, 308. Specifically, each watchdog timeout period is selected to be longer than the normal period between executions of the custom code 310, 312, 314. The watchdog timers 304, 306, 308 are reset in response to signals generated on the ISA bus 362 in response to execution of the respective custom code 310, 312, 314 within the first single board computer 302, or in response to signals on the serial port 374 generated in response to execution of the respective custom code 316, 318, 320 within the second single board computer 364. As a result, when the custom code 310, 312, 314 is being periodically executed, the watchdog timers 304, 306, 308 are reset before their respective watchdog timeout periods are reached. If, however, one or more of the custom code 310, 312, 314 processes is not executed, such as would be the case if one or more software routines fails, or if there is a hardware failure on the first single board computer 302 (or the second single board computer 364), and therefore the corresponding signals are not generated, the watchdog timeout period for the corresponding watchdog timer 304, 306, 308 is reached. In response to reaching the respective watchdog timeout period, the respective watchdog timer will signal the switch over circuit 318 to effect a switch over, thus causing the second single board computer (or the first single board computer) to boot, and to take control of the industrial personal computer system.
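The relationship between the reset period, the timeout period, and the switch over circuit can be sketched as follows. This is a hedged software model, not the disclosed hardware: the `margin` factor, the class and function names, the millisecond values, and the choice to restart all timers after a switch over are assumptions introduced for illustration. The only constraint taken from the text is that each timeout is longer than the normal period between executions of the corresponding reset code.

```python
from dataclasses import dataclass


@dataclass
class Watchdog:
    period_ms: int          # normal interval between executions of the reset code
    margin: float = 2.0     # hypothetical factor; the source only requires timeout > period
    last_reset_ms: int = 0

    @property
    def timeout_ms(self):
        # Timeout chosen per timer, longer than that timer's normal reset period.
        return int(self.period_ms * self.margin)

    def reset(self, now_ms):
        self.last_reset_ms = now_ms

    def expired(self, now_ms):
        return now_ms - self.last_reset_ms >= self.timeout_ms


class SwitchOverCircuit:
    """Tracks which board is active; toggles on a switch over signal."""

    def __init__(self):
        self.active = "first"

    def switch_over(self):
        self.active = "second" if self.active == "first" else "first"
        return self.active


def poll(watchdogs, circuit, now_ms):
    """If any timer has expired, switch boards and return the newly active one."""
    if any(wd.expired(now_ms) for wd in watchdogs.values()):
        active = circuit.switch_over()
        for wd in watchdogs.values():
            wd.reset(now_ms)  # assumption: timers restart for the newly active board
        return active
    return None


# Three independent timers keyed by the numerals used in the description.
watchdogs = {304: Watchdog(100), 306: Watchdog(250), 308: Watchdog(1000)}
circuit = SwitchOverCircuit()

for wd in watchdogs.values():
    wd.reset(100)                        # all three reset codes run at t = 100 ms
healthy = poll(watchdogs, circuit, 150)  # nothing has expired yet

watchdogs[306].reset(250)                # the code feeding timer 304 has stalled
watchdogs[308].reset(250)
result = poll(watchdogs, circuit, 300)   # timer 304 reaches its 200 ms timeout
```

Because each timer is checked independently, a stall in any single monitored routine is enough to trigger the switch over, even while the other routines continue to reset their own timers. A real system would also need the timers to tolerate the boot time of the newly active board, which this sketch does not model.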




Referring to FIG. 4, shown is a schematic diagram of an exemplary implementation of the industrial personal computer system of FIG. 1. As the schematic diagram is self-explanatory, in view of the above description presented in reference to FIGS. 1 and 3, no further explanation of this schematic is made herein.




Referring to FIG. 5, shown is a schematic diagram of an exemplary implementation of the industrial personal computer system of FIG. 2. As the schematic diagram is self-explanatory, in view of the above description presented in reference to FIGS. 1, 2 and 3, no further explanation of this schematic is made herein.

While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.



Claims
  • 1. A computer system comprising: a first computer; a first bus switch coupled to the first computer; a data bus coupled to the first computer via the first bus switch; a second computer; a second bus switch coupled to the second computer, the data bus being coupled to the second computer through the second bus switch; a monitor system coupled to the first computer, to the first bus switch, and to the second bus switch, the monitor system comprising a watchdog timer coupled to a switch over circuit, wherein a watchdog timeout period exceeds a period between executions of a reset code, the reset code being included in software executing on the first computer, wherein a reset signal is generated in response to execution of the reset code, thereby resetting the watchdog timer prior to the watchdog timeout period, and wherein upon a failure in the first computer the reset code is not executed, and therefore the reset signal is not generated, thereby not resetting the watchdog timer prior to the watchdog timeout period, wherein the watchdog timer generates a switch over signal in the event the watchdog timeout period is reached before the watchdog timer is reset.
  • 2. The computer system of claim 1 wherein said monitor system is coupled to said second computer, wherein another reset code is included in software executing on the second computer, wherein another reset signal is generated in response to execution of the other reset code, thereby resetting the watchdog timer prior to the watchdog timeout period, and wherein upon a failure in the second computer the other reset code is not executed, and therefore the other reset signal is not generated, thereby not resetting the watchdog timer prior to the watchdog timeout period, wherein the watchdog timer generates the switch over signal in the event the watchdog timeout period is reached before the watchdog timer is reset.
  • 3. The computer system of claim 2 wherein the monitor system opens the first data bus switch and closes the second data bus switch in response to the switch over signal, in the event the switch over signal is generated as a result of said reset signal not being generated, and wherein the monitor system opens the second data bus switch and closes the first data bus switch in response to the switch over signal, in the event the switch over signal is generated as a result of said other reset signal not being generated.
  • 4. The computer system of claim 2 wherein the monitor system powers off the first computer and powers on the second computer in response to the switch over signal, in the event the switch over signal is generated as a result of said reset signal not being generated, and wherein the monitor system powers on the first computer and powers off the second computer in response to the switch over signal, in the event the switch over signal is generated as a result of said other reset signal not being generated.
  • 5. The computer system of claim 2 wherein said data bus is a PCI bus.
  • 6. The computer system of claim 2 wherein said data bus is an ISA bus.
US Referenced Citations (8)
Number Name Date Kind
4200226 Piras Apr 1980 A
4610013 Long et al. Sep 1986 A
5155729 Rysko et al. Oct 1992 A
5185693 Loftis et al. Feb 1993 A
5406472 Simmons et al. Apr 1995 A
5434998 Akai et al. Jul 1995 A
5583987 Kobayashi et al. Dec 1996 A
5729675 Miller et al. Mar 1998 A
Non-Patent Literature Citations (1)
Entry
http://www.sbs.com/communications/products/cascade.shtml