Embodiments of the invention generally relate to the field of client/server systems and, more particularly, to a system and method for monitoring availability of servers.
In a large enterprise, software components may perform business-critical tasks, with the components often forming large software landscapes composed of many components spread over many host systems operating together in a network. Because the components are software components and run as applications on host operating systems, their availability (their ability to perform the functions for which they are intended) is not easily monitored using standard operating system monitoring tools, which have no knowledge of application-level software.
To address this, certain systems include facilities for monitoring the availability of software components, with an agent being used to provide monitoring. However, the existing facilities are generally inadequate for availability monitoring purposes. In conventional systems, availability monitoring is generally performed too infrequently. If a monitoring agent requires constant instruction, then the operation of the agent will require a great amount of processing and communications time. Further, the deployment of a single agent program to conduct actual availability checks may be inadequate for monitoring needs.
A system and method for implementation of monitoring availability of servers.
According to a first embodiment of the invention, a method includes providing a set of monitoring instructions regarding monitoring of the availability of applications to an agent, with the monitoring instructions including a time stamp value. The method further provides for receiving an inquiry from the agent regarding the status of the monitoring instructions, with the inquiry including the time stamp value. The time stamp value is compared to a time value for a current set of instructions, and, if the time value for the current instructions is later than the received time stamp, the current instructions are sent to the agent.
Under a second embodiment of the invention, a system includes a monitoring agent, the monitoring agent to monitor the availability of applications according to a received work list, with the work list including an effective time value. The system further includes a central monitoring system, the central monitoring system maintaining a current work list for the agent. The current work list includes an effective time value, and the central monitoring system is to send the current work list to the agent if the effective time value for the current work list is later than the effective time value of the monitoring agent's work list.
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
Embodiments of the invention are generally directed to a system and method for monitoring availability of applications.
As used herein, “application” means a computer application or program.
In one embodiment of the invention, an autonomous agent is provided for monitoring of the availability of applications. In one embodiment, an agent monitors the availability of applications using a set of instructions, which may be a work list of systems to monitor.
In one embodiment of the invention, a monitoring agent periodically or upon the occurrence of some event makes an inquiry to a central monitoring system to determine whether the monitoring agent's work list is current. In one embodiment, the monitoring agent provides a time stamp to the central monitoring system, the time stamp representing an effective time for the work list of the monitoring agent. In other embodiments, another designation, such as a version number, may be used to represent the status of the work list.
In one embodiment, a central monitoring system receives an inquiry from a monitoring agent and determines the status of the work list of the monitoring agent by comparing a time stamp of the work list with a time stamp for a current work list maintained by the central monitoring system. In one embodiment, if the time stamp of the current work list is later than the time stamp of the work list held by the monitoring agent, the central monitoring system determines that the monitoring agent requires a new work list and sends the new work list to the monitoring agent. The monitoring agent then conducts monitoring activities according to the new work list. In one embodiment, if the time stamp of the current work list is not later than the time stamp of the monitoring agent's work list, the central monitoring system determines that the monitoring agent does not require a new work list and sends a “no change” message to the monitoring agent. The monitoring agent then continues monitoring activities according to the work list held by the monitoring agent.
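The comparison described above can be sketched as follows. This is a minimal illustration rather than the implementation of any embodiment: the names `WorkList`, `CentralMonitoringSystem`, `set_work_list`, and `handle_inquiry` are hypothetical, the time stamp is assumed to be a number of seconds, and a return value of `None` stands in for the “no change” message.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class WorkList:
    """Hypothetical work list: systems to check plus an effective time stamp."""
    effective_time: float        # assumed unit: seconds since some epoch
    systems: Tuple[str, ...]     # names of the components to monitor


class CentralMonitoringSystem:
    """Keeps the current work list for each agent (illustrative only)."""

    def __init__(self) -> None:
        self._current: Dict[str, WorkList] = {}

    def set_work_list(self, agent_id: str, work_list: WorkList) -> None:
        self._current[agent_id] = work_list

    def handle_inquiry(self, agent_id: str, agent_time_stamp: float) -> Optional[WorkList]:
        """Return the current work list only if it is later than the agent's.

        None plays the role of the "no change" message.
        """
        current = self._current[agent_id]
        if current.effective_time > agent_time_stamp:
            return current
        return None
```

Because the test is strictly "later than", an agent that reports the same time stamp as the current work list receives the “no change” reply, so an unchanged work list is never retransmitted.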
In one embodiment of the invention, a mechanism is provided for running several monitoring agents at the same time. In one embodiment, multiple agents may have either overlapping or separate workloads for operations. In one embodiment, a system may switch back and forth between monitoring agents. In one embodiment, a monitoring agent may be assigned to a group of systems. For example, a first agent is assigned to a first group of systems, a second agent is assigned to a second group of systems, and so on, with possible overlap existing between the groups assigned to the agents. Under an embodiment of the invention, a cross check of availability of systems may be made using multiple monitoring agents. For example, a first agent in a first location and a second agent in a second location may both check system availability for a given system or set of systems, thereby providing views of availability from multiple locations.
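One way to picture overlapping assignments and a cross check is the sketch below. The agent and system names, the dictionary shapes, and the helper functions `agents_covering` and `cross_check` are all assumptions made for illustration; the description does not prescribe a data model.

```python
from typing import Dict, Set


def agents_covering(assignments: Dict[str, Set[str]], system: str) -> Set[str]:
    """Return the agents whose assigned group includes the given system."""
    return {agent for agent, group in assignments.items() if system in group}


def cross_check(reports: Dict[str, Dict[str, bool]], system: str) -> Dict[str, bool]:
    """Collect each reporting agent's view of one system's availability."""
    return {agent: seen[system] for agent, seen in reports.items() if system in seen}


# Hypothetical assignment: the two groups overlap on "sys-b".
assignments = {
    "agent-site-1": {"sys-a", "sys-b"},
    "agent-site-2": {"sys-b", "sys-c"},
}

# Hypothetical availability results reported by each agent.
reports = {
    "agent-site-1": {"sys-a": True, "sys-b": True},
    "agent-site-2": {"sys-b": False, "sys-c": True},
}
```

With these sample values, `cross_check(reports, "sys-b")` yields one entry per location, which would let an operator see that the system is reachable from the first site but not from the second.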
In one embodiment of the invention, each agent of a number of agents makes a request for workload update periodically, or upon some other event, and includes a time stamp for the current workload. Each agent then receives confirmation that there is no change in workload, or receives a new workload. Using such method, a central monitoring system is able to utilize multiple agents for monitoring with minimal overhead in directing the agents and minimal communications to maintain current status.
In one embodiment of the invention, an agent operates asynchronously, operating with its own knowledge of the monitoring tasks. In one embodiment, an agent acts independently, without requiring constant direction. In the embodiment, the agent operates with a workload and periodically inquires about an update, with the inquiry providing a time when the current workload became effective. The agent either receives an updated workload or receives a message indicating there is no change. If a no change message is received, the agent continues with the current tasks. If a change message is received, the agent begins the new workload. In one embodiment, the agent pushes data back to the central control, rather than requiring that data be pulled from the agent. The asynchronous and autonomous operation of the agent provides monitoring data without requiring extensive command processes.
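The agent side of this exchange might look like the following sketch. Everything named here is hypothetical: the `MonitoringAgent` class, the tuple shape of a work list, and the three callables (`inquire`, `check`, `push`) that stand in for the communication channel to the central system and for the actual availability check.

```python
from typing import Callable, Dict, Optional, Tuple

# Assumed shape of a work list: (effective_time, names of systems to check).
WorkList = Tuple[float, Tuple[str, ...]]


class MonitoringAgent:
    """Illustrative autonomous agent; all names here are hypothetical."""

    def __init__(
        self,
        work_list: WorkList,
        inquire: Callable[[float], Optional[WorkList]],
        check: Callable[[str], bool],
        push: Callable[[Dict[str, bool]], None],
    ) -> None:
        self.work_list = work_list
        self._inquire = inquire   # asks the central system; None means "no change"
        self._check = check       # performs one availability check
        self._push = push         # sends results back to the central system

    def inquire_for_update(self) -> bool:
        """Report the effective time of the held work list and adopt any replacement."""
        effective_time, _ = self.work_list
        reply = self._inquire(effective_time)
        if reply is None:
            return False          # keep working on the current work list
        self.work_list = reply
        return True

    def run_checks(self) -> None:
        """Check every system on the held work list and push the results."""
        _, systems = self.work_list
        self._push({name: self._check(name) for name in systems})
```

Passing the communication and check functions in as callables keeps the sketch independent of any particular transport; a scheduler (not shown) would call `run_checks` at the configured frequency and `inquire_for_update` periodically or upon some event.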
In one example, an embodiment of the invention may be implemented in the CCMS (Computing Center Management System) Monitoring Architecture of SAP AG for monitoring the availability of SAP software components. In a CCMS system, an agent designated as a CCMSPING agent program may be deployed to conduct availability checks. In one embodiment, a CCMSPING agent is independent and operates autonomously. The agent conducts availability checks on its own according to a customizable frequency on a work list of software components to check. In one embodiment, multiple such agents may operate simultaneously. To allow flexible distribution of workload and to allow testing of availability from the point of view of various sites, multiple agents may work in parallel on separate work lists, may work in parallel on shared work lists, or may operate with a combination of shared and separate work lists. In one embodiment, each of the agents operates independently of a central monitoring system. In one embodiment, the work lists of multiple agents are synchronized by a central monitoring system.
In one embodiment of the invention, monitored applications may reside on various different computer platforms. In one embodiment, a monitored application may be implemented in a J2EE™ (Java™2 Platform, Enterprise Edition) platform. The J2EE platform is described in the J2EE specification, including as provided in version 1.4, Nov. 24, 2003. In another embodiment, an application may be implemented on an ABAP (Advanced Business Application Programming) platform of SAP AG. In another embodiment, an application may reside in another type of computer platform.
The central monitoring system 105 provides monitoring instructions for the monitoring agents. In one embodiment of the invention, the agents periodically provide inquiries to the central system regarding the state of the monitoring instructions of the agents. In one embodiment, the central system only provides new instructions to monitoring agents when the instructions have changed, based on a time stamp value for the existing instructions.
Periodically the monitoring agent 210 will make an inquiry 225 to the central monitoring system 205 to determine whether the current work list of the monitoring agent 210 is current or whether the work list needs to be modified. The inquiry includes the time stamp for the work list. In one example, the central system compares the received time stamp against a time stamp representing an up-to-date work list for the monitoring agent 210. If the central monitoring system 205 determines that the monitoring agent already has the most current work list, then the central system 205 will send a no change instruction 230 to the monitoring agent 210. Based on the no change instruction 230, the monitoring agent 210 continues the initial monitoring operation 235.
In one example, after a certain amount of time has passed, the monitoring agent 210 sends another inquiry 240 to the central monitoring system 205. In this example, the central monitoring system 205 again compares the received time stamp against a time stamp representing a current work list for the monitoring agent 210. In this case, the time stamp for the up-to-date work list is later than the received time stamp, indicating that the monitoring agent 210 does not have the most current work list. The central monitoring system 205 will then send the new work list 245 to the monitoring agent 210, and the monitoring agent 210 will perform the new monitoring operation based at least in part on the new workload instructions 250.
People integration 402 is performed using a portal solution 412 and a platform to work in collaboration 414. Users are provided a multi-channel access 410 to ensure mobility. Examples of the portal solution 412 include SAP Enterprise Portal, SAP Mobile Engine, and Collaboration Package for SAP Enterprise Portal. Information integration 404 refers to the conversion of information into knowledge. Information integration 404 provides efficient business intelligence 418 and knowledge management 420 using, for example, SAP products such as Business Information Warehouse (BW) and Knowledge Management (KM). Further, consolidation of master data management beyond system boundaries is performed using SAP's Master Data Management (MDM) 416. Process integration 406 refers to optimized process management using integration broker or SAP exchange infrastructure 422 and business process management 424 techniques. Examples of products to perform process integration 406 include Exchange Infrastructure (XI) and Business Process Management (BPM).
An application platform 408 may include SAP's Web Application Server (Web AS), which is the basis for SAP applications. Web AS, which may be independent of the database and operating system 430, includes a J2EE engine 426 in combination with the proprietary ABAP (Advanced Business Application Programming) engine or instance 428 to further enhance the application platform 408.
The architecture 400 further includes a composite application framework 432 to provide various open interfaces (APIs) and a lifecycle management 434, which is an extension of a previously existing transport management system (TMS). As illustrated, the architecture 400 further provides communication with Microsoft .NET 436, International Business Machines (IBM) WebSphere 438, and other such systems 440.
The Web AS 520 with ABAP engine 502 further includes a J2EE program engine 504. The J2EE engine 504 may support one or more program instances. The J2EE engine 504 is in communication with the ABAP engine 502 via a fast Remote Function Call (RFC) connection 506. The ABAP engine 502 and the J2EE engine 504 are further in communication with an Internet Communication Manager (ICM) 508. The ICM 508 is provided for handling and distributing queries to various individual components of the architecture 500. The architecture 500 further supports a browser 510, such as Microsoft Internet Explorer, Netscape Navigator, and other modified variations of mobile end devices, such as personal digital assistants (PDAs), pocket computers, smart cell phones, other hybrid devices, and the like. The Web AS 520 also supports various protocols and standards 512, such as HyperText Markup Language (HTML), eXtensible Markup Language (XML), Wireless Markup Language (WML), Hypertext Transfer Protocol (HTTP) and Hypertext Transfer Protocol Secure (HTTPS), Simple Mail Transfer Protocol (SMTP), Web Distributed Authoring and Versioning (WebDAV), Simple Object Access Protocol (SOAP), Single Sign-On (SSO), Secure Sockets Layer (SSL), X.509, Unicode, and the like.
The computer 605 further comprises a random access memory (RAM) or other dynamic storage as a main memory 625 to store information and instructions to be executed by the processors 615 through 620. The RAM or other main memory 625 also may be used for storing temporary variables or other intermediate information during execution of instructions by the processors 615 through 620.
A hard drive or other storage device or computer-readable storage medium 630 may be used by the computer 605 for storing information and instructions. The storage device or computer-readable storage medium 630 may include a magnetic disk or optical disc and its corresponding drive, flash memory or other nonvolatile memory, or other memory device. Such elements may be combined together or may be separate components. The computer 605 may include a read only memory (ROM) 635 or other static storage device for storing static information and instructions for the processors 615 through 620.
A keyboard or other input device 640 may be coupled to the bus 610 for communicating information or command selections to the processors 615 through 620. The input device 640 may include a keyboard, a keypad, a touch-screen and stylus, a voice-activated system, or other input device, or combinations of such devices. The computer may further include a mouse or other cursor control device 645, which may be a mouse, a trackball, or cursor direction keys to communicate direction information and command selections to the processors and to control cursor movement on a display device. The computer 605 may include a computer display device 650, such as a cathode ray tube (CRT), liquid crystal display (LCD), or other display technology, to display information to a user. In some environments, the display device may be a touch-screen that is also utilized as at least a part of an input device. In some environments, the computer display device 650 may be or may include an auditory device, such as a speaker for providing auditory information.
A communication device 655 may also be coupled to the bus 610. The communication device 655 may include a modem, a transceiver, a wireless modem, or other interface device. The computer 605 may be linked to a network or to other devices via an interface 660, which may include links to the Internet, a local area network, or another environment. The computer 605 may comprise a server that connects to multiple devices. In one embodiment, the computer 605 comprises a Java compatible server that is connected to user devices and to external resources.
It should be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the invention.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Number | Date | Country | |
---|---|---|---|
20060149729 A1 | Jul 2006 | US |