This application is related to U.S. patent application Ser. No. 11/073,045, entitled, “Protecting Data Transactions on an Integrated Circuit Bus,” by M. Insley et al., filed on Mar. 4, 2005, which is hereby incorporated herein by reference.
At least one embodiment of the present invention pertains to remote management of a processing system and more particularly, to a method and apparatus for communicating between an agent and a remote management module in a processing system.
In many types of computer networks, it is desirable to be able to perform certain management-related functions on a processing system from a remote location. For example, a business enterprise may operate a large computer network that includes numerous client and server processing systems (hereinafter “clients” and “servers”, respectively). With such a network, it may be desirable to allow a network administrator to perform or control various functions on the clients and/or servers from a remote console via the network, such as monitoring various functions and conditions in these devices, configuring the devices, performing diagnostic functions, debugging, software upgrades, etc. To facilitate explanation, such functions are referred to collectively and individually as “management functions”.
One particular application in which it is desirable to have this capability is in a storage-oriented network, i.e., a network that includes one or more storage servers that store and retrieve data on behalf of one or more clients. Such a network may be used, for example, to provide multiple users with access to shared data or to back up mission-critical data. An example of such a network is illustrated in
In
Also shown in
In the illustrated configuration, the administrative console 5 must be directly coupled to the storage server 2 and must be local to the storage server 2. This limitation is disadvantageous, in that it may be impractical or inconvenient to locate the administrative console 5 close to the storage server 2. Further, this configuration makes it difficult or impossible to use the same administrative console to manage multiple devices on a network.
Technology does exist to enable management functions to be performed on a computer system remotely via a network. In one approach, a device known as a remote management module (RMM) is incorporated into a processing system to enable remote management of the processing system (referred to as the “host” processing system) via a network. The RMM is often in the form of a dedicated circuit card separate from the other elements of the host processing system. The RMM normally has a network interface that connects to the network and a separate internal interface that connects to one or more components of the processing system. The RMM typically includes control circuitry (e.g., a microprocessor or microcontroller) which is programmed or otherwise configured to respond to commands received from a remote administrative console via the network and to perform at least some of the management functions mentioned above.
One shortcoming of known RMM technology is that the internal interface between the RMM and the host processing system, as well as the software on the RMM, are generally customized for a particular design of host processing system. As a result, it tends to be complicated and expensive to port an existing RMM design to a different design of host processing system. Furthermore, upgrades or other design changes to the RMM tend to be difficult and expensive.
Hence, it would be desirable to have remote management technology which enables remote management functions on a processing system, such as a storage server, where the remote management technology is more platform-independent, and thus, more readily usable with multiple host processing system designs.
The present invention includes a processing system that comprises control circuitry to control the processing system, a remote management module to enable remote management of the processing system via a network, and an agent to operate as an intermediary between the remote management module and the control circuitry. The agent and the remote management module are configured to cooperatively implement an abstraction layer through which the agent and the remote management module communicate.
Other aspects of the invention will be apparent from the accompanying figures and from the detailed description which follows.
One or more embodiments of the present invention are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
A method and apparatus for communicating event data from an agent to an RMM in a processing system are described. A processing system such as a storage server can include a remote management module, which enables remote management of the processing system via a network, and an agent, which is used to monitor for various events in the processing system and acts as an intermediary between the RMM and the control circuitry of the processing system. In accordance with embodiments of the invention, as described in greater detail below, the agent and the RMM in such a processing system cooperatively implement an abstraction layer, through which the agent and the remote management module communicate event data and other information. The abstraction layer makes the RMM more platform-independent, and thus, more usable for various different designs of host processing system.
The agent continuously monitors for any of various events that may occur within the processing system. The processing system includes sensors to detect at least some of these events. The agent includes a first-in first-out (FIFO) buffer. Each time an event is detected, the agent queues an event record describing the event into the FIFO buffer. When an event record is stored in the FIFO buffer, the agent asserts an interrupt to the RMM. The interrupt remains asserted while event record data is present in the FIFO.
When the RMM detects assertion of the interrupt, the RMM sends a request for the event record data to the agent over a dedicated link between the agent and the RMM. In certain embodiments of the invention, that link is an inter-IC (IIC or I2C) bus. In response to the request, the agent begins dequeuing the event record data from the FIFO and transmits the data to the RMM. The RMM timestamps the event record data as they are dequeued and stores the event record data in a non-volatile event database in the RMM. The RMM may then transmit the event record data to a remote administrative console over the network, where the data can be used to output an event notification to the network administrator.
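The queue-and-drain behavior described above can be illustrated with a minimal sketch. It is not the implementation of any embodiment; the names AgentFifo, rmm_service_interrupt and event_db are hypothetical, the interrupt is modeled as a simple property that is true while records are queued, and a Python list stands in for the non-volatile event database.

```python
from collections import deque
import time


class AgentFifo:
    """Hypothetical model of the agent's event FIFO and its interrupt line."""

    def __init__(self):
        self._records = deque()

    @property
    def irq(self):
        # Level-style interrupt: asserted while any event record is queued.
        return len(self._records) > 0

    def queue_event(self, record):
        # Called each time the agent detects an event.
        self._records.append(record)

    def dequeue_event(self):
        # Called when the RMM requests event record data (oldest first).
        return self._records.popleft()


def rmm_service_interrupt(agent, event_db, clock=time.time):
    """Drain the FIFO while the interrupt is asserted, timestamping each record."""
    while agent.irq:
        record = agent.dequeue_event()
        event_db.append((clock(), record))


agent = AgentFifo()
agent.queue_event(0x0012)
agent.queue_event(0x0345)
event_db = []  # stand-in for the RMM's non-volatile event database
rmm_service_interrupt(agent, event_db, clock=lambda: 1000.0)
```

Because the records wait in the FIFO until they are requested, the RMM in this sketch can service the interrupt at its own pace, and the timestamp reflects when a record was dequeued rather than when the event occurred.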
Events are encoded with event numbers by the agent, and the RMM has knowledge of the encoding scheme. As a result, the RMM can determine the cause of any event (from the event number) without requiring any detailed knowledge of the hardware.
The above-mentioned abstraction layer, which provides greater platform independence, is formed by the use of the above-mentioned queuing and dequeuing of event data, along with a command packet protocol by which the RMM requests and receives event record data from the agent. One advantage of this technique, in addition to greater platform independence, is that the RMM does not have to read the event data from the agent at the same speed at which the agent acquires the event data. Consequently, the RMM can read the event data at a slower rate, for example, than the rate at which the events occur or are detected by the agent.
An example of a network configuration in which this approach can be employed is shown in
Referring now to
The storage server 20 also includes one or more internal mass storage devices 34, a console serial interface 35, a network adapter 36 and a storage adapter 37, which are coupled to the processor(s) through the chipset 33. The storage server 20 may further include redundant power supplies 38, as shown.
The internal mass storage devices 34 may be or include any conventional medium for storing large volumes of data in a non-volatile manner, such as one or more magnetic or optical based disks. The serial interface 35 allows a direct serial connection with a local administrative console, such as console 5 in
The storage server 20 further includes a number of sensors 39 and presence detectors 40. The sensors 39 are used to detect changes in the state of various environmental variables in the storage server 20, such as temperatures, voltages, binary states, etc. The presence detectors 40 are used to detect the presence or absence of various components within the storage server 20, such as a cooling fan, a particular circuit card, etc.
The storage server 20 further includes an RMM 41 and an associated agent 42. The RMM provides a network interface and is used to allow a remote processing system, such as an administrative console, to control and/or perform various management functions on the storage server via network 21, which may be a LAN or a WAN, for example. The management functions may include, for example, monitoring various functions and state in the storage server 20, configuring the storage server 20, performing diagnostic functions on and debugging the storage server 20, upgrading software on the storage server 20, etc. The RMM 41 is designed to operate independently of the storage server 20. Hence, the RMM 41 runs on standby power and/or an independent power supply, so that it is available even when the main power to the storage server 20 is off. In certain embodiments of the invention, the RMM 41 provides diagnostic capabilities for the storage server 20 by maintaining a log of console messages that remain available even when the storage server 20 is down. The RMM 41 is designed to provide enough information to determine when and why the storage server 20 went down, even by providing log information beyond that provided by the operating system of the storage server 20. This functionality includes the ability to send a notice to the remote administrative console 22 on its own initiative, indicating that the storage server 20 is down, even when the storage server 20 is unable to do so.
The agent 42, at a high level, monitors various functions and states within the storage server 20 and acts as an intermediary between the RMM 41 and the other components of the storage server 20. Hence, the agent 42 is coupled to the RMM 41 as well as to the chipset 33 and the processor(s) 31 of the storage server 20, and receives input from the sensors 39 and presence detectors 40.
At a lower level, the agent 42 serves several purposes. First, the agent provides the RMM 41 with certain controls over the storage server 20. These controls include the ability to reset the storage server 20, to generate a non-maskable interrupt (NMI), and to turn on and off the power supplies 38. The agent 42 also monitors the storage server 20 for changes in system-specified signals that are of interest. When any of these signals changes, the agent 42 captures the state of the signal(s) which changed state and presents that data to the RMM 41 for logging. In addition, the agent 42 provides a consolidation point/interrupt controller for the interrupts from the various environmental sensors 39 and detectors 40 in the storage server 20, for use by the host processor(s) 31 of the storage server 20.
Referring now to
The processor(s) 51 is/are the CPU of the RMM 41 and may be, for example, one or more programmable general-purpose or special-purpose microprocessors, DSPs, microcontrollers, ASICs, PLDs, or a combination of such devices. The processor 51 inputs and outputs various control signals and data 55 to and from the agent 42, as described further below.
In at least one embodiment, the processor 51 is a conventional programmable, general-purpose microprocessor which runs software from local memory on the RMM 41 (e.g., flash 52 and/or RAM 53).
The application layer 62 includes a packet layer 72, which cooperates with the serial driver 70, and a control/status decode layer 73 which cooperates with the IIC control module 71. The packet layer 72 is responsible for converting packets received from other modules in the application layer 62 into a serial format for transmission by the serial driver 70, and for converting serial data received from the serial driver 70 into a packet format for use by other modules in application layer 62. The control/status decode layer 73 is responsible for implementing a command packet protocol on the IIC bus for communication with the agent 42, as described further below.
The application layer 62 also includes: a command line interface (CLI) 74 to allow an authorized user to control functions of the RMM 41; an application programming interface (API) 75 to allow an authorized remote application to make calls to the RMM software 60; an event monitoring module 76 to request dequeuing of event data from the agent 42 and to assign timestamps to the dequeued data; an event management module 77 to receive event information from the event monitoring module 76, to manage a local event database in the RMM 41, and to generate outbound alerts for transmission over the network 21 in response to certain events; and a power control module 78 to control power to the storage server 20.
The agent 42 and the RMM 41 are also connected by a bidirectional IIC bus 79, which is primarily used for communicating data on monitored signals and states (i.e., event data) from the agent 42 to the RMM 41. A special command packet protocol is implemented on this IIC bus 79, as described further below. Note that in other embodiments of the invention, an interconnect other than IIC can be substituted for the IIC bus 79. For example, in other embodiments the interface provided by IIC bus 79 may be replaced by an SPI, JTAG, USB, IEEE-488, RS-232, LPC, IIC, SMBus, X-Bus or MII interface. The RMM 41 also provides a presence signal PRES to the agent 42, which is a binary signal that indicates to the agent 42 when the RMM 41 is present (installed and operational).
The interface 80 between the agent 42 and the CPU 31 and chipset 33 of the storage server 20 is similar to that between the agent 42 and the RMM 41; however, the details of that interface 80 are not germane to the present invention.
The sensors 39 further are connected to the CPU 31 and chipset 33 by an IIC bus 81. The agent 42 further provides a control signal CTRL to each power supply 38 to enable/disable the power supplies 38 and receives a status signal STATUS from each power supply 38.
In certain embodiments, the agent 42 is embodied as one or more integrated circuit (IC) chips, such as a microcontroller, a microcontroller in combination with an FPGA, or other configuration.
As noted above, the agent 42 acts as an interrupt controller for monitored signals 92 from the sensors 39, presence detectors 40, etc. The process of detecting and responding to events is described now with reference to
The RMM 41 responds by reading the FIFO buffer 89 (as described below) until the agent 42 de-asserts the normal interrupt IRQ (which the agent 42 does when the FIFO buffer 89 becomes empty). The size of the FIFO buffer 89 is chosen such that it can hold at least the maximum number of events that the agent 42 concurrently monitors plus some predetermined number of additional events.
Although the specific format of event records in the FIFO buffer is implementation-specific,
Bits[13:12] are the Event Type field, which encodes each event as one of four possible types of events: Normal system event, Status event, Storage Server Command event, or RMM Command event. With regard to the Normal system event type, when an event occurs at an input to the agent 42, the event is entered into the FIFO buffer 89 if the corresponding signal is not masked. With regard to the Status event type, in response to an “RMM Capture Sensor State” command from the RMM 41 (on the IIC bus 79), the agent 42 scans all of its sensor inputs and places an entry into the FIFO buffer 89 with the Event Type field set to indicate a Status event. With regard to the Storage Server Command event type, certain agent commands associated with the storage server 20 can be specified to result in entries being recorded in the FIFO buffer 89; when such a command is received from the RMM interface, the event type bits are set to indicate a Storage Server Command event. Similarly, certain agent commands associated with the RMM 41 can be specified to result in entries being recorded in the FIFO buffer 89; when such a command is received from the RMM interface, the event type bits are set to indicate an RMM Command event.
Bits[11:0] of the event record are the Signal ID. For Normal and Status events, this field is the encoded signal number (identifier). Each signal is assigned a number with 12 bits, allowing detection of up to 4,096 different events. For Command events, this field contains the command value if the command is designed to generate an event or if the command is a non-supported command. For supported commands, bit[15] of the event record is cleared. Any command received which is not supported by the agent 42 is also placed into the FIFO buffer 89, but with bit[15] set.
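As an illustration of the fields just described, the following minimal sketch unpacks a 16-bit event record. The function name decode_event_record and the dictionary keys are hypothetical. The numeric codes assigned to the four event types are not given here, so the sketch returns the raw two-bit Event Type value rather than a type name, and it does not interpret bit[14].

```python
EVENT_TYPE_SHIFT = 12
EVENT_TYPE_MASK = 0x3      # two bits: bits[13:12]
SIGNAL_ID_MASK = 0x0FFF    # twelve bits: bits[11:0], i.e. 2**12 = 4096 values
UNSUPPORTED_BIT = 1 << 15  # bit[15]: set for commands the agent does not support


def decode_event_record(record):
    """Split a 16-bit event record into the fields described in the text."""
    if not 0 <= record <= 0xFFFF:
        raise ValueError("event record must fit in 16 bits")
    return {
        # Raw 2-bit code; the mapping of codes to type names is not assumed.
        "event_type": (record >> EVENT_TYPE_SHIFT) & EVENT_TYPE_MASK,
        "signal_id": record & SIGNAL_ID_MASK,
        "unsupported_command": bool(record & UNSUPPORTED_BIT),
    }


fields = decode_event_record(0x9ABC)
# fields["event_type"] is 1, fields["signal_id"] is 0xABC,
# and fields["unsupported_command"] is True.
```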
The RMM 41 uses a command packet protocol to control the agent 42. This protocol, in combination with the FIFO buffer 89 described above, provides the abstraction layer 44 between the RMM 41 and the agent 42. In certain embodiments, the command and data link between the RMM 41 and the agent 42 is the IIC bus 79, as described above; however, in other embodiments a different type of link can be used.
The command packet protocol is now further described with reference to
In
In certain embodiments, the Slave Address field is seven bits representing the combination of a preamble (four bits) and slave device ID (three bits). The device ID bits are typically programmable on the slave device (e.g., via pin strapping). Hence, multiple devices can operate on the same IIC bus. “R/W” represents a read/write bit (e.g., “1” for reads, “0” for writes).
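The composition of the address byte can be sketched as follows. This is only an illustration under stated assumptions: the function name build_address_byte is hypothetical, the preamble is assumed to occupy the upper four bits of the seven-bit Slave Address with the device ID in the lower three, and the R/W bit is placed in the least significant bit of the transmitted byte, as is conventional for seven-bit IIC addressing.

```python
def build_address_byte(preamble, device_id, read):
    """Combine a 4-bit preamble, a 3-bit device ID and the R/W bit into one byte."""
    if not 0 <= preamble <= 0xF:
        raise ValueError("preamble must fit in four bits")
    if not 0 <= device_id <= 0x7:
        raise ValueError("device ID must fit in three bits")
    # Assumed ordering: preamble in the upper bits, device ID in the lower bits.
    slave_address = (preamble << 3) | device_id
    # R/W bit follows the 7-bit address: 1 for reads, 0 for writes.
    return (slave_address << 1) | (1 if read else 0)


read_byte = build_address_byte(0b1010, 0b011, read=True)    # 0xA7
write_byte = build_address_byte(0b1010, 0b011, read=False)  # 0xA6
```

With three programmable device ID bits, up to eight devices sharing the same preamble can be distinguished on one bus in this sketch.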
As is well-known, IIC does not provide any mechanism to ensure data integrity. Consequently, certain embodiments of the invention add such a mechanism to the communications between the agent 42 and the RMM 41 on the IIC bus 79. In certain embodiments, this mechanism is provided by following each data byte that goes over the IIC bus 79 (i.e., not the Slave Address, the S/ANN or R/W bits) with its 1's complement. This is shown in
To perform a read operation, the RMM 41 issues a special class of Write command, called a Read Setup command, to the agent 42 over the IIC bus 79, to tell the agent 42 what the RMM 41 wants to do next. The RMM 41 then performs a Read operation on the IIC bus 79, to cause the agent 42 to provide the data.
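The integrity mechanism in which each data byte is followed by its 1's complement can be sketched as below. The function names protect and verify are hypothetical, and the sketch covers only the data bytes; the Slave Address and the bus control bits are outside its scope.

```python
def protect(data: bytes) -> bytes:
    """Follow every data byte with its 1's complement."""
    out = bytearray()
    for byte in data:
        out.append(byte)
        out.append(byte ^ 0xFF)  # bitwise complement within eight bits
    return bytes(out)


def verify(payload: bytes) -> bytes:
    """Check each (byte, complement) pair and return the original data bytes."""
    if len(payload) % 2 != 0:
        raise ValueError("payload must contain byte/complement pairs")
    data = bytearray()
    for i in range(0, len(payload), 2):
        byte, complement = payload[i], payload[i + 1]
        if byte ^ complement != 0xFF:
            raise ValueError(f"integrity check failed at pair {i // 2}")
        data.append(byte)
    return bytes(data)


wire = protect(b"\x12\xa5")  # bytes 0x12 0xED 0xA5 0x5A
recovered = verify(wire)     # b"\x12\xa5"
```

The scheme doubles the number of data bytes on the bus, but the receiver can detect a corrupted byte with a single XOR per pair.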
Many different commands may be implemented between the RMM 41 and the agent 42 on the IIC bus 79, depending upon the specific needs of the system. One such command is the Read FIFO command. The Read FIFO command is sent by the RMM 41 over the IIC bus 79 to the agent 42 in response to the agent's assertion of the normal interrupt IRQ, to command the agent 42 to return event data from the FIFO buffer 89. The Read FIFO command is an example of a Read Setup command, which as noted above is actually a special class of Write command. In response to a Read FIFO command, the agent 42 transfers data from the FIFO buffer 89 to the RMM 41 using one or more Read packets. In certain embodiments of the invention, FIFO data is always transferred one event at a time, as follows: an IIC Start (“S”), Slave Address, four data bytes (i.e., FIFO upper data byte and its 1's complement followed by FIFO lower data byte and its 1's complement), IIC Stop (“P”). If no other Read Setup command is issued, a subsequent IIC Read transfer sends the next entry in the FIFO buffer 89 to the RMM 41. FIFO pointers for the FIFO buffer 89 are updated only after the agent 42 has an indication that the transfer has succeeded, as can be determined with IIC error checking. If any of the first three data bytes are NACKed, then the transfer is deemed to have failed, and the FIFO pointers are not updated.
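The rule that the FIFO pointers advance only after a successful transfer can be sketched as follows. The class EventFifo and its method read_transfer are hypothetical, the bus itself is not modeled, and the acknowledgements for the first three data bytes are passed in as three booleans.

```python
class EventFifo:
    """Hypothetical agent-side FIFO whose read pointer commits only on success."""

    def __init__(self, records):
        self._records = list(records)
        self._read_index = 0

    def read_transfer(self, acks):
        """Return the four data bytes for the head entry, or None if empty.

        acks holds the ACK (True) or NACK (False) status of the first
        three data bytes; the pointer advances only if all three are ACKed.
        """
        if len(acks) != 3:
            raise ValueError("expected ACK status for the first three data bytes")
        if self._read_index >= len(self._records):
            return None
        record = self._records[self._read_index]
        upper, lower = (record >> 8) & 0xFF, record & 0xFF
        # Upper byte and its complement, then lower byte and its complement.
        frame = bytes([upper, upper ^ 0xFF, lower, lower ^ 0xFF])
        if all(acks):
            self._read_index += 1  # transfer succeeded: commit the pointer
        return frame


fifo = EventFifo([0x1234, 0x9ABC])
failed = fifo.read_transfer((True, False, True))    # NACK: entry is kept
retried = fifo.read_transfer((True, True, True))    # same entry, now consumed
next_frame = fifo.read_transfer((True, True, True))  # second entry
```

In this sketch a failed transfer leaves the entry at the head of the FIFO, so a later Read transfer returns the same event again and no event record is lost.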
Many other types of commands can be implemented between agent 42 and the RMM 41 on the IIC bus 79 using the above-described command packet protocol. Examples of such commands are commands used to turn the power supplies 38 on or off, to reboot the storage server 20, to read specific registers in the agent 42, and to enable or disable sensors and/or presence detectors. Some of these commands may be recorded by the agent 42 as events in the FIFO buffer 89.
Thus, a method and apparatus for communicating event data between a remote management module and an agent in a processing system have been described. Although the present invention has been described with reference to specific exemplary embodiments, it will be recognized that the invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative sense rather than a restrictive sense.
Number | Date | Country | |
---|---|---|---|
20060200471 A1 | Sep 2006 | US |