Heartbeat Heuristics

Information

  • Patent Application
  • 20070180077
  • Publication Number
    20070180077
  • Date Filed
    January 30, 2006
  • Date Published
    August 02, 2007
Abstract
A device monitoring system for monitoring a device comprising: a database including a health record for the device; a heartbeat server coupled to the database; a heartbeat agent operating on the device and coupled to the heartbeat server; a heartbeat packet sent from the heartbeat agent to the heartbeat server; and an update to the health record of the device responsive to the heartbeat packet.
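The data flow in the abstract (agent sends a heartbeat packet, server updates the device's health record) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the application: the `HealthRecord` fields, the JSON wire format, the in-memory dictionary standing in for the database, and the function names. The packet fields follow claim 12 (a timestamp and a device identifier).

```python
import json
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HealthRecord:
    """Hypothetical health record kept per device in the database."""
    device_id: str
    status: str = "unknown"
    last_heartbeat: float = 0.0


def make_heartbeat_packet(device_id: str, now: Optional[float] = None) -> bytes:
    """Agent side: encode a heartbeat as JSON bytes (assumed wire format)."""
    timestamp = time.time() if now is None else now
    return json.dumps({"device_id": device_id, "timestamp": timestamp}).encode("utf-8")


def apply_heartbeat(database: Dict[str, HealthRecord], packet: bytes) -> HealthRecord:
    """Server side: decode the packet and update the device's health record."""
    fields = json.loads(packet.decode("utf-8"))
    device_id = fields["device_id"]
    # Create the record on first contact; a dict stands in for the database.
    record = database.setdefault(device_id, HealthRecord(device_id))
    record.status = "up"
    record.last_heartbeat = fields["timestamp"]
    return record
```

In a deployment along the lines of claim 7, the bytes returned by `make_heartbeat_packet` would presumably travel as a UDP datagram; the transport is left out here so the sketch stays self-contained.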
Description

DESCRIPTION OF THE DRAWINGS

The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:

FIG. 1 is a block diagram showing an example architecture for a device monitoring system.

FIG. 2 is a block diagram showing an example set of workflows in a heartbeat server.

FIG. 3 is a diagram showing a time line of various example intervals, and the various example processes initiated at those intervals by devices and/or heartbeat servers.

FIG. 4 is a block diagram showing an example of a heartbeat check (“HBC”) process.

FIG. 5 is a block diagram showing an example of an agentless ping (“ASP”) process.

FIG. 6 is a block diagram showing an example computing environment in which the technology described above may be implemented.


Claims
  • 1. A device monitoring system for monitoring a device comprising: a database including a health record representing the device; a heartbeat server coupled to the database; a heartbeat agent operating on the device and coupled to the heartbeat server; a heartbeat packet sent from the heartbeat agent to the heartbeat server; and an update to the health record representing the device, the update responsive to the heartbeat packet.
  • 2. The system of claim 1 further comprising: a group including the heartbeat server and the device; a second group including a second heartbeat server and a second device, the second heartbeat server being coupled to the database, and the second device being coupled to the second heartbeat server.
  • 3. The system of claim 2 further comprising a second heartbeat agent operating on the heartbeat server.
  • 4. The system of claim 3 further comprising: a coupling between the heartbeat server and the second heartbeat server; a second heartbeat packet sent from the second heartbeat agent to the second heartbeat server; and a second update to a second health record, the second health record representing the heartbeat server, the second update responsive to the second heartbeat packet.
  • 5. The system of claim 2 further comprising a ping sent from the second heartbeat server to the second device.
  • 6. The system of claim 5 further comprising: a ping reply responsive to the ping sent from the second device to the second heartbeat server; and a third update to a third health record, the third health record being for the second device, the third update responsive to the ping reply.
  • 7. The system of claim 1 wherein the heartbeat packet conforms to User Datagram Protocol format.
  • 8. The system of claim 1 further comprising a persistent connection between the heartbeat agent and the heartbeat server, the persistent connection conforms to a connection-oriented protocol.
  • 9. The system of claim 8 wherein if the persistent connection is broken then the health record is updated to indicate that the heartbeat agent is “down”.
  • 10. The system of claim 1 wherein the device sends the heartbeat packet to a third heartbeat server, the third heartbeat server being coupled to a second database.
  • 11. The system of claim 1 wherein the heartbeat packet includes information that can be used to help determine if a system or an application operating on the device is capable of performing its intended functions.
  • 12. The system of claim 1 wherein the heartbeat packet includes a timestamp and an identifier that uniquely identifies the device.
  • 13. The system of claim 1 wherein the heartbeat agent includes information identifying the heartbeat server or identifying a backup heartbeat server.
  • 14. A computer-implemented method for monitoring a device, the method comprising: sending a heartbeat packet on a heartbeat send interval from a heartbeat agent to a heartbeat server, the heartbeat agent operating on the device; receiving the heartbeat packet at the heartbeat server; updating a cache entry responsive to the heartbeat packet, the cache entry representing the device; and setting an indication that the heartbeat packet was received.
  • 15. The method of claim 14 further comprising: checking the indication on a heartbeat check interval; and if the indication has been set, then clearing the indication, but if the indication has not been set, then sending a ping to the device, and then detecting a ping reply or a lack of a ping reply from the device, and then updating the cache entry responsive to the ping reply or the lack of the ping reply.
  • 16. The method of claim 15 further comprising: on a database update interval, determining if the cache entry has been updated; and if the cache entry has been updated, updating a health record responsive to the cache entry.
  • 17. The method of claim 15 further comprising, if the indication has not been set and an “ignore missing heartbeats” flag is set for the device, then skipping the steps of sending, detecting and updating.
  • 18. The method of claim 14 wherein the method is embodied on computer-readable media.
  • 19. A computer-implemented method for monitoring a device, the method comprising: on a heartbeat check interval, determining if a new heartbeat packet has been received from the device; and if the new heartbeat packet has been received, then indicating an “up” status for the device, but if the new heartbeat packet has not been received, then sending a ping to the device, and then determining if a ping reply has been received from the device, and then if the ping reply has been received, then indicating a “heartbeat agent unavailable” status for the device, but if the ping reply has not been received, then indicating a “down” status for the device.
  • 20. The method of claim 19 wherein the method is embodied on computer-readable media.
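The method steps recited in claims 14 through 17 and claim 19 can be read together as a small state machine on the heartbeat server. The following is a minimal sketch of that reading, not the applicant's implementation. The `CacheEntry` fields, the function names, the `ping` callback (which returns True when a ping reply arrives), and the dictionary standing in for the database are all hypothetical, and the three timers (send, check, and database update intervals) are left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CacheEntry:
    """Hypothetical per-device cache entry held by the heartbeat server."""
    status: str = "unknown"
    heartbeat_received: bool = False  # the "indication" of claim 14
    ignore_missing: bool = False      # "ignore missing heartbeats" flag (claim 17)
    dirty: bool = False               # changed since the last database update


def on_heartbeat(cache: Dict[str, CacheEntry], device_id: str) -> None:
    """Claim 14: update the cache entry and set the received indication."""
    entry = cache.setdefault(device_id, CacheEntry())
    entry.status = "up"
    entry.heartbeat_received = True
    entry.dirty = True


def heartbeat_check(cache: Dict[str, CacheEntry], ping: Callable[[str], bool]) -> None:
    """Claims 15, 17, 19: run once per heartbeat check interval."""
    for device_id, entry in cache.items():
        if entry.heartbeat_received:
            # A heartbeat arrived during this interval: clear the indication.
            entry.heartbeat_received = False
            continue
        if entry.ignore_missing:
            # Claim 17: skip the ping for devices flagged to ignore misses.
            continue
        # No heartbeat: fall back to a ping and classify by the reply.
        entry.status = "heartbeat agent unavailable" if ping(device_id) else "down"
        entry.dirty = True


def flush_to_database(cache: Dict[str, CacheEntry], database: Dict[str, str]) -> None:
    """Claim 16: on the database update interval, write only changed entries."""
    for device_id, entry in cache.items():
        if entry.dirty:
            database[device_id] = entry.status
            entry.dirty = False
```

Keeping the indication separate from the status lets the check pass distinguish "no heartbeat this interval" from "device previously reported up", which is what allows a reachable device with a stopped agent to be reported as “heartbeat agent unavailable” rather than “down”.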
Provisional Applications (1)
Number Date Country
60736915 Nov 2005 US