The ever-increasing availability of high-throughput computer network connections has enabled computer processing capability to be distributed among many different computing devices that can be spread out across a variety of physical locations. For example, data centers, housing hundreds or thousands of computing devices, are becoming more commonplace, both among entities that seek to utilize for themselves the processing capabilities supported by such data centers, and among entities that seek to sell such processing capabilities to others. Irrespective of the manner in which data centers are monetized, each data center, and the computing devices and associated hardware contained therein, can represent a substantial financial investment. More specifically, much of the hardware that comprises a data center, especially the computational hardware, can not only require an initial outlay of capital to purchase, but can also represent a depreciating asset whose value decreases over time.
Consequently, it can be financially beneficial to track hardware to ensure that it is being utilized in an efficient manner. It can also be financially beneficial to track hardware for purposes of improving future estimates of capital expenditures and other like financial investments in such hardware resources. Unfortunately, tracking and managing a myriad of hardware across diverse geographic locations can be difficult to implement. For example, a single data center can comprise thousands of computing devices and associated hardware that may each need to be individually tracked and managed. Many organizations, however, manage multiple data centers spread across diverse geographic locations, which multiplies the amount of hardware to be tracked.
Traditional mechanisms for tracking and managing hardware, especially large volumes of physically distributed hardware, comprise centralized mechanisms that aggregate the relevant information into a single centralized database. Such mechanisms can be inefficient, as there can be communicational delays in communicating with centralized mechanisms that may be located in a location geographically distant from the location of the hardware being managed. Additionally, such mechanisms can be inefficient due to the cost of implementing and maintaining the hardware needed to implement such centralized mechanisms in the first place.
In one embodiment, hardware asset management can comprise a manager managing a defined set of managed assets, where both the manager and the managed assets can comprise information about one another. The manager can maintain information regarding the managed assets including detailed identification information. Additionally, the managed assets can maintain identification information, and, optionally, additional information, regarding the manager that is managing them.
In another embodiment, if one of the managed assets is replaced, such as due to an upgrade, or a replacement of a failed asset, communications can be exchanged between the manager and the replacement managed asset so that the manager can obtain identifying information, and, optionally, other detailed information, about the replacement managed asset, and so that the replacement managed asset can obtain an identification, and, optionally, other detailed information, about the manager.
In a further embodiment, if the asset manager is replaced, such as due to an upgrade, or a replacement of a failed asset manager, initially, the replacement asset manager can broadcast a request for identifying information, and, optionally, other detailed information, to the assets it is managing. Additionally, periodic communications between the assets being managed and the asset manager can include a request, by each of the assets, individually, for identifying information, and, optionally, other detailed information, about the replacement asset manager. To ensure that each asset it is managing recognizes the replacement asset manager, the replacement asset manager can request, of the assets, an identification of the asset manager that those assets believe is their asset manager. If such an identification does not match that of the asset manager initiating such a request, corrective action can be performed.
In a still further embodiment, an asset manager can maintain state information for each asset it is managing, the state information indicating a state that the asset is in, and informing a subsequent action to be undertaken by the asset manager with respect to each such asset. The asset manager can establish appropriate states given the asset being managed, such that simple assets can comprise only two or three states, such as a “power on” or “power off” state, or a further subdivision of the “power on” state into an “operational” state and a “failed” state. More complex assets can be associated with additional, more detailed, intermediate states. Periodic communications between the asset manager and the assets being managed can enable the asset manager to update the state information for individual assets it is managing and inform the next periodic communication.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Additional features and advantages will be made apparent from the following detailed description that proceeds with reference to the accompanying drawings.
The following detailed description may be best understood when taken in conjunction with the accompanying drawings, of which:
The following description relates to state-based decentralized hardware asset management comprising both managed assets and a manager managing those assets. Both the manager and the managed assets can comprise information about one another. If one of the managed assets is replaced, communications can be exchanged between the manager and the replacement managed asset so that the manager can obtain identifying information, and, optionally, other detailed information, about the replacement managed asset, and so that the replacement managed asset can obtain an identification, and, optionally, other detailed information, about the manager. If the asset manager is replaced, initially, the replacement asset manager can broadcast a request for identifying information, and, optionally, other detailed information, to the assets it is managing. Additionally, periodic communications between the assets being managed and the asset manager can include a request, by each of the assets, individually, for identifying information, and, optionally, other detailed information, about the replacement asset manager. To ensure that each asset recognizes the replacement asset manager, the replacement asset manager can request, of the assets, an identification of the asset manager that those assets believe is their asset manager. An asset manager can also maintain state information for each asset it is managing, the state information indicating a state that the asset is in, and informing a subsequent action to be undertaken by the asset manager with respect to each such asset. The state information also enables an asset manager to optimize performance, such as by avoiding the monitoring or attempted performance of actions that are incompatible with a current state of an asset.
For purposes of illustration, the techniques described herein are directed to specific types of hardware, such as rack-mount server computing devices. However, references to, and illustrations of, such hardware are strictly exemplary and are not intended to limit the mechanisms described to the specific examples provided. Indeed, the techniques described are applicable to any hardware asset that is individually purchasable and replaceable.
Additionally, although not required, the description below will be in the general context of computer-executable instructions, such as program modules, being executed by one or more computing devices. More specifically, the description will reference acts and symbolic representations of operations that are performed by one or more computing devices or peripherals, unless indicated otherwise. As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by a processing unit of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in memory, which reconfigures or otherwise alters the operation of the computing device or peripherals in a manner well understood by those skilled in the art. The data structures where data is maintained are physical locations that have particular properties defined by the format of the data.
Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the computing devices need not be limited to conventional personal computers, and include other computing configurations, including hand-held devices, multi-processor systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Similarly, the computing devices need not be limited to a stand-alone computing device, as the mechanisms may also be practiced in distributed computing environments linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
With reference to
In one embodiment, the manager 130 can comprise an asset management table 140 that can comprise information about the assets that the manager 130 is managing. For example, the asset management table 140 can comprise a unique identifier for each of the managed assets 111, 112 and 113 that the manager 130 can be managing at a point in time. Additionally, in one embodiment, the asset management table 140 can comprise further information, such as further identifying information, or other further asset tracking and utilization information. For example, the asset management table 140 can comprise information indicative of the manufacturer of an asset, the model number of an asset, the date of manufacture of an asset, the version number of an asset, any asset options that may be installed or set on an asset, and other like information. As will be recognized by those skilled in the art, such additional information is often colloquially referred to as Field Replaceable Unit information, or FRU information.
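As an illustration only, an asset management table of the kind described above could be sketched as a mapping from each asset's unique identifier to its optional FRU information. The class names, field names and example values below (`FruInfo`, `AssetManagementTable`, `"asset-111"`, `"ExampleCo"`) are assumptions made for this sketch and are not defined by the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class FruInfo:
    """Optional detailed identification (FRU) information for one asset."""
    manufacturer: Optional[str] = None
    model_number: Optional[str] = None
    manufacture_date: Optional[str] = None
    version: Optional[str] = None
    options: Dict[str, str] = field(default_factory=dict)


@dataclass
class AssetManagementTable:
    """Maps each managed asset's unique identifier to its FRU information."""
    entries: Dict[str, FruInfo] = field(default_factory=dict)

    def record(self, asset_id: str, fru: Optional[FruInfo] = None) -> None:
        # Recording an identifier that is already present overwrites its details.
        self.entries[asset_id] = fru if fru is not None else FruInfo()

    def remove(self, asset_id: str) -> None:
        # Removing an unknown identifier is a no-op.
        self.entries.pop(asset_id, None)

    def known_ids(self) -> Set[str]:
        return set(self.entries)


# Hypothetical identifiers standing in for the managed assets 111, 112 and 113.
table = AssetManagementTable()
table.record("asset-111", FruInfo(manufacturer="ExampleCo", model_number="X1"))
table.record("asset-112")
table.record("asset-113")
```

Keeping the unique identifier as the key and the FRU details as an optional value mirrors the distinction drawn above between the identifier that is always tracked and the further information that may or may not be present.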
The management of the assets 111, 112 and 113, by the manager 130, can, in one embodiment, include periodic communications from the manager 130 to the assets 111, 112 and 113. Such periodic communications can include instructions that the manager 130 desires the managed assets 111, 112 and 113 to carry out, as well as status checks or verifications that the manager 130 seeks to perform. In response to such periodic communications, the manager 130 can receive information from the managed assets 111, 112 and 113, including, for example, confirmation that an instruction was carried out, information obtained by the carrying out of the instruction, status information, or other like information.
In one embodiment, as will be described in further detail below, the manager 130 can also maintain information regarding a state of one or more of the assets it is managing such as, for example, the managed assets 111, 112 and 113 shown in the system 100 of
A response to the communication 151, can be provided individually, by one or more of the managed assets 111, 112 and 113. Such a response is illustrated in
In some instances, however, it can be advantageous to replace an asset that is being managed by the manager 130, or to assign new assets to be managed by the manager 130. For example, in the exemplary system 100 of
To provide decentralized hardware asset management, in one embodiment, the assets being managed can comprise an identification of the manager managing them, or, more specifically, of the manager they believe is currently managing them. Thus, for example, as illustrated in the exemplary system 100 of
When a new managed asset, such as the asset 114, is added to the assets that are being managed by the manager 130, such as by replacing an existing asset 113, as illustrated by the replacement action 160, the new asset 114 can comprise a table 124 that can lack an identification of the manager 130, since the asset 114 would not be aware that the manager 130 was managing it. Consequently, in one embodiment, the manager 130 can request that the asset 114 provide it with an identification of the manager that the asset 114 believes is currently managing it. In an alternative embodiment, illustrated in the system 100 of
In one embodiment, the updating, by an asset, of its locally-maintained table comprising identification of the manager which that asset believes is managing it can occur as part of a voting process among the assets. Such a voting process will be described in further detail below. In another embodiment, however, under certain circumstances, assets can update their locally-maintained manager identifiers on their own and without external support. For example, assets whose tables are blank, or otherwise indicate that the asset either was just installed or was not previously managed, can initialize such a table with whichever manager identifier such an asset first receives. Thus, as illustrated by the action 183 in
Turning to
Additionally, as described in detail above, each of the assets, such as the assets 111, 112 and 113, can send a request, as illustrated by the communications 251, to the new manager 230, requesting identification information, or, optionally, additional information, about the new manager 230. In response to each of such communications 251, the new manager 230 can, individually, transmit a response providing the requested information, as illustrated by the communications 252. Since the manager was changed from the manager 130 to the new manager 230, upon receiving the communications 252, each of the managed assets 111, 112 and 113 can, individually, compare the manager identifier received via the communications 252 to the manager identifier stored in their tables 121, 122 and 123, respectively, and can determine that the identifiers do not match.
In one embodiment, in such an instance, where an asset receives a manager identifier that differs from the manager identifier it has retained locally, a voting protocol can be utilized to ensure that all of the assets being managed by a manager agree as to which device is acting as their manager. Such a voting protocol can be triggered, as indicated, by one or more of the assets detecting that the manager information received, such as via the communications 252, differs from the manager information currently retained at the asset. For example, an asset making a determination that the most recently received manager identifier differs from the manager identifier which that asset has stored in its table can transmit communications to the other assets initiating a voting protocol. Subsequently, each of the assets can transmit the manager identifier they have locally stored to the manager. Alternatively, rather than sending communications to the other assets, an asset determining that a received manager identifier differs from a locally-stored manager identifier can transmit communications to the manager, requesting that the manager initiate the voting protocol. As yet another alternative, a voting protocol can be triggered by the new manager 230 on its own, such as either as a periodic event, or as part of the updating of the asset management table 240. As to the latter, in one embodiment, if the new manager 230 detects that more than a threshold portion of the asset management table 240 was replaced, or reflects new assets, the new manager 230 can initiate a voting protocol.
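The asset-side check and the collection of stored manager identifiers can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Asset`, `on_manager_id` and `run_vote`, and the identifier strings, are hypothetical; a blank table is modeled as `None`; and the sketch only reports which assets disagree, because the corrective action itself is not specified here.

```python
from typing import Dict, List, Optional


class Asset:
    """A managed asset that remembers which manager it believes manages it."""

    def __init__(self, asset_id: str, stored_manager_id: Optional[str] = None):
        self.asset_id = asset_id
        # None models a blank table (new or previously unmanaged asset).
        self.stored_manager_id = stored_manager_id

    def on_manager_id(self, received_id: str) -> bool:
        """Handle a received manager identifier; return True if a vote is needed."""
        if self.stored_manager_id is None:
            # Blank table: adopt the first manager identifier received.
            self.stored_manager_id = received_id
            return False
        # A mismatch is the trigger for the voting protocol.
        return self.stored_manager_id != received_id


def run_vote(manager_id: str, assets: List[Asset]) -> List[str]:
    """Collect each asset's stored manager id; return the assets that disagree."""
    votes: Dict[str, Optional[str]] = {
        a.asset_id: a.stored_manager_id for a in assets
    }
    return [aid for aid, believed in votes.items() if believed != manager_id]


# Hypothetical scenario: two assets still remember the old manager,
# and one newly installed asset has a blank table.
assets = [
    Asset("asset-111", "manager-130"),
    Asset("asset-112", "manager-130"),
    Asset("asset-114"),
]
needs_vote = [a.on_manager_id("manager-230") for a in assets]
mismatched = run_vote("manager-230", assets)
```

In this scenario the two assets that still hold the old identifier each signal that a vote is needed, while the blank asset simply adopts the new identifier and therefore does not appear among the mismatches.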
The system 200 of
Irrespective of the manner in which the voting protocol is triggered, the voting protocol can, in one embodiment, comprise the transmission, by each of the assets 111, 112 and 113, to the new manager 230, of the information currently stored in the tables 121, 122 and 123 of those assets. As indicated, such information can comprise an identification of the manager that each of those assets, individually, believes is currently managing them. The transmission of the manager identification, individually, by each of the assets 111, 112 and 113 is illustrated in the system 200 of
Upon receiving the communications 262 the new manager 230 can compare the manager identifier from each of the communications 262 to its own identifier to ensure that each of the assets 111, 112 and 113 believes it is being managed by the new manager 230, and has an identification of the new manager 230 stored in their respective tables, namely the tables 121, 122 and 123. Such a comparison is represented by the action 263, shown in the system 200 of
Turning to
Although described more fully with reference to the state diagram 400 of
In one embodiment, if a change of the assets that are being managed is detected at step 330, processing can proceed at step 335. For example, as illustrated previously, one such change, which can be detected at step 330, can be a replacement of a managed asset with a new managed asset. Other like changes can also be detected. In response, at step 335, an identifier can be requested from one or more of the managed assets. In one embodiment, rather than conditioning the performance of step 335, and the requesting of an identifier from one or more managed assets, only on the detection of a change of those managed assets, step 335 can, instead, simply be performed periodically irrespective of whether the manager has any input suggestive of a change of the managed assets. In one optimization, a check can be made, such as at step 340, to determine whether the identifiers returned by the managed assets, in response to the request sent at step 335, are the same as the identifiers of the managed assets of which the asset manager is previously aware, such as the identifiers already stored in an asset management table that can be maintained by the asset manager. If such a check, at step 340, is made, and it is determined that the asset identifiers that were returned are the same, at least for some assets, then, in one embodiment, at least for those assets, processing can proceed directly to step 350, and need not perform step 345. Conversely, if, at step 340, it is determined that the asset identifiers of at least some assets are different, such a determination can indicate that those assets changed and, subsequently, at step 345, detailed identification information can be requested from those assets. 
As indicated previously, such detailed identification information can include an identification of the manufacturer of such assets, the serial number of such assets, the model number of such assets, any installed options that such assets may have, and other like detailed identification information. In other embodiments, rather than performing steps 335, 340 and 345 in sequence, the request for detailed identification information at step 345 can be made without first requesting an identifier, such as at step 335.
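The optimization of steps 335, 340 and 345 can be sketched as a refresh loop that requests detailed identification information only from assets whose identifier changed. The function `refresh_table`, the slot-keyed table layout and the two reader callbacks are assumptions made for illustration; how a manager actually addresses an asset is not specified in the description.

```python
from typing import Callable, Dict, List


def refresh_table(
    table: Dict[str, dict],
    read_identifier: Callable[[str], str],
    read_details: Callable[[str], dict],
) -> List[str]:
    """Update `table` in place and return the slots whose asset changed."""
    changed = []
    for slot, entry in list(table.items()):
        current_id = read_identifier(slot)       # step 335: request identifier
        if current_id == entry.get("id"):        # step 340: same as stored?
            continue                             # unchanged, so skip step 345
        # Step 345: identifier differs, so request detailed identification.
        table[slot] = {"id": current_id, "details": read_details(slot)}
        changed.append(slot)
    return changed


# Hypothetical hardware: the asset in slot-3 was replaced (113 -> 114).
hardware = {"slot-1": "id-111", "slot-2": "id-112", "slot-3": "id-114"}
detail_requests = []


def read_identifier(slot: str) -> str:
    return hardware[slot]


def read_details(slot: str) -> dict:
    detail_requests.append(slot)  # record which assets were queried in detail
    return {"manufacturer": "ExampleCo", "serial": "SN-" + hardware[slot]}


table = {
    "slot-1": {"id": "id-111", "details": {}},
    "slot-2": {"id": "id-112", "details": {}},
    "slot-3": {"id": "id-113", "details": {}},
}
changed = refresh_table(table, read_identifier, read_details)
```

Only the replaced asset incurs the more expensive detailed query; the returned list of changed slots could then be used to decide which assets receive the manager's own identification at step 350.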
In addition to obtaining identification information, such as an identifier or the detailed identification information described above, an asset manager can also provide its own identification information to the assets it is managing. Thus, at step 350, the asset manager can provide its own identification information to one or more of the assets it is managing. For example, in one embodiment, at step 350, the asset manager can provide its own identification information only to those one or more assets that the asset manager determined have changed, such as through the execution of steps 330, 335 and 340. In another embodiment, at step 350, the asset manager can provide its own identification information to some or all of the assets it is managing irrespective of whether or not the asset manager has received input suggestive that such assets have changed. In yet another embodiment, the provision of manager identification information to managed assets, such as at step 350, can be in response to an explicit query for such information from one or more of the managed assets. In such an embodiment, assets can periodically request manager identification information, such as is indicated by utilization of the term “heartbeat” in the illustration of
While the steps 330 through 355 have been described above in the context of a change of one or more of the assets being managed, another type of change that can be detected, such as at step 330, can be a change of the asset manager itself. More specifically, in one embodiment, when a new asset manager is instantiated, it can proceed directly with step 335 since such a new asset manager may not comprise sufficient information to perform steps 315 through 325 without first performing at least one of steps 335 and 345. Consequently, in such an embodiment, a new asset manager can request identifiers from the assets it is managing, such as at step 335, and can, optionally, request detailed identification information from those assets, such as at step 345. In one embodiment, managed assets and asset managers can be communicationally coupled to one another through dedicated communicational links, such as through serial communication cables, or other like communicational infrastructure. In such an embodiment, a new asset manager can perform step 335, or step 345, if performed directly without first performing steps 335 and 340, by simply broadcasting such a request to all devices communicationally coupled to the same dedicated communicational link as the new asset manager. In such a manner, the asset manager can discover all of the assets that it is to manage, since those assets would be communicationally coupled to the new asset manager, such as through such dedicated, or out of band, communicational infrastructure.
Once a new asset manager has completed steps 335 through 350, described in detail above, it can, optionally, in one embodiment, initiate a voting among the managed assets to ensure that those managed assets correctly recognize the new asset manager as their asset manager. For example, as described above, upon receiving the manager identifier that the manager transmitted at step 350, one or more of the assets can detect that such a manager identifier differs from the manager identifier that such an asset has stored. In response, those one or more assets can request that the manager initiate a voting protocol. Such a request can be received at step 355. In response, the manager can request the manager identifier from the assets it is managing, such as is illustrated by step 360. In another embodiment, the voting can be initiated by the new asset manager itself, such as by directly performing step 360, without receiving a request at step 355. Consequently, to illustrate that step 355 is optional, it is shown in
Irrespective of the manner in which voting is triggered, in one embodiment, as part of the voting, each of the assets can transmit, to the new asset manager, an identifier of the asset manager that those assets believe is managing them, namely the asset manager identifier that those assets have stored locally. At step 365, the asset manager can compare the asset manager identifier received from one or more of the assets with its own identifier to verify that those assets recognize the new asset manager as their asset manager. If the identifiers are the same, then the assets do recognize the new asset manager as their asset manager, and processing can proceed with the management of those assets, such as in the manner represented by steps 315 through 325, described in detail above, and shown in the exemplary flow diagram 300 of
Turning to
Turning to the exemplary state diagram 400 shown in
If, through one of the periodic checks, the asset manager detects the application of power 415, then, in one embodiment, the asset can be transitioned to an initialize state 450. Such an initialize state 450 can represent a temporary, or transitory, state that more complex assets, such as servers, can transition through, and which can be appropriate for such assets. Other assets, such as fans, may not need to be associated with an initialize state and can, instead, transition directly from a powered off state to one of the powered on states, such as the healthy state 420 which will be described in detail below. If an asset is in the initialize state 450, an asset manager can detect either a removal of power 451, an initialization failure 453 or an initialization success 454. If the asset is powered off, or loses power, while it is still in the initialize state 450, the asset manager can detect a removal of power 451 and can change the state of the asset from the initialize state 450 to the powered off state 410. As will be understood by those skilled in the art, the detection of the removal of power 451, by the asset manager, can be based on periodic checking of the asset, or on continuous monitoring thereof. In either case, the existence of the asset in the initialize state 450 can inform the behavior of the asset manager with respect to that asset, namely that the asset manager can continue to monitor the asset such as to, for example, detect the removal of power 451, or to detect other events, such as an initialization failure 453, or initialization success 454. Additionally, knowledge that the asset is in the initialize state 450 can enable the asset manager to optimize its performance. For example, the above-referenced monitoring can occur at a reduced frequency. 
Additionally, as another performance optimization, the asset manager can avoid transmitting instructions or queries to an asset, which the asset manager has associated with the initialize state 450, that are incompatible with an asset in such a state. In such a manner, the asset manager can optimize its performance. As illustrated in the exemplary state diagram 400 of
As in the case of the initialize state 450, the transition state 440 can, in one embodiment, represent a temporary state through which more complex assets, such as servers, can transition, and which can be appropriate for such assets. For example, a transition state can represent a state after an asset has properly initialized, but before the asset manager has verified proper operation of the asset. Thus, for example, from the transition state 440, an asset manager can cause the asset to perform an operation, such as to read a sensor, and such a successful sensor read 442 can cause the asset manager to transition the asset from the transition state 440 to the healthy state 420. As before, the state of the asset can inform the action taken by the asset manager with respect to such asset. For example, the asset manager can attempt to read sensors of assets that are in the transition state 440. In such an example, the existence of the asset in the transition state 440 informs the action of the asset manager, namely the attempting of the sensor read. Additionally, in such an example, the existence of the asset in the transition state 440 can enable the asset manager to optimize its operation, such as by only attempting the sensor read, or changing the manner in which such a sensor read is performed to avoid inefficiencies. Should the read sensor operation not be successful, such a read sensor failure 443 can cause the asset manager to transition the asset from the transition state 440 to a fail state 430. Another action that can be undertaken by the asset manager with respect to an asset in the transition state 440 can be to monitor the asset to determine whether the asset still has power. If the asset manager detects a removal of power 441, it can change the asset from the transition state 440 back to the powered off state 410.
An asset in the healthy state 420 can be performing properly and can be managed properly by the asset manager. In one embodiment, such proper performance and management can entail the reading of sensors of the asset. For example, simple assets, such as fans, can comprise sensors that can report fan speed, power consumption, temperature, and other like metrics. More complex assets, such as, for example, server computing devices, can comprise a myriad of other sensors including temperature sensors, various component sensors, such as hard disk drive sensors, and other like detection mechanisms. An asset can be maintained in the healthy state 420 if the reading of sensors, such as by the asset manager, is successful. Thus, as illustrated by the exemplary state diagram 400 of
In one embodiment, a failure state, such as the fail state 430, can entail a state in which an asset may have failed one or more tasks, but may otherwise be operating properly. For example, the failure to read a sensor can cause an asset to be assigned the fail state 430. As will be recognized by those skilled in the art, however, sensor failures can often occur due to timeout conditions, where the sensor may be operating, but the asset cannot read such a sensor within an allocated amount of time. As will also be recognized by those skilled in the art, such conditions can often be temporary, and can be due to a particular set of circumstances, that may resolve itself, such as without further action on the part of the asset manager. Consequently, in one embodiment, for assets classified in the fail state 430, the asset manager can attempt to perform a simpler operation such as, for example, the reading of an identifier. If the reading of an identifier is successful, the asset manager can transition the asset from the fail state 430 to the transition state 440, as illustrated by the successful reading of the identifier 434 in the state diagram 400. In the transition state 440, the asset manager can then, again, attempt to read the sensor and, if such a reading is successful, the asset can be transitioned back to the healthy state 420 as illustrated by the successful sensor read 442, which was described previously. Conversely, if the re-attempted reading of the sensor is not successful, the asset can be transitioned back to the fail state 430, as illustrated by the sensor read failure 443, which was also described previously.
If the reading of an identifier is, itself, unsuccessful, then the failed sensor read 433 can cause the asset to be maintained in the fail state 430. In one embodiment, after an excessive number of such failures are detected, the asset manager can instruct the asset to initialize 435, which can transition the asset back to the initialize state 450, which was described in detail above. Again, as before, the existence of an asset in the fail state 430, coupled with the detection of an excessive quantity of failures, can inform the actions of the asset manager with respect to such an asset, namely it can cause the asset manager, in the present example, to initialize 435 the asset. As before, the asset manager can continue to monitor assets for the removal of power 431, including for assets in the fail state 430. Should such a removal of power 431 be detected, the asset manager can transition the asset back to the powered off state 410.
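The per-asset state tracking described above can be sketched as a small table-driven state machine. Several details are assumptions of this sketch rather than statements of the description: the event names, the `TrackedAsset` class, the failure threshold `MAX_FAILURES`, the transition from the initialize state 450 to the transition state 440 on initialization success (inferred from the description of the transition state), and the healthy-state transitions on sensor failure and power removal. The target of the initialization failure 453 is not stated in the text above, so it is not modeled.

```python
from enum import Enum


class State(Enum):
    POWERED_OFF = 410
    HEALTHY = 420
    FAIL = 430
    TRANSITION = 440
    INITIALIZE = 450


# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    (State.POWERED_OFF, "power_applied"): State.INITIALIZE,
    (State.INITIALIZE, "power_removed"): State.POWERED_OFF,
    (State.INITIALIZE, "init_success"): State.TRANSITION,  # assumed target
    (State.TRANSITION, "sensor_read_ok"): State.HEALTHY,
    (State.TRANSITION, "sensor_read_failed"): State.FAIL,
    (State.TRANSITION, "power_removed"): State.POWERED_OFF,
    (State.HEALTHY, "sensor_read_ok"): State.HEALTHY,
    (State.HEALTHY, "sensor_read_failed"): State.FAIL,     # assumed
    (State.HEALTHY, "power_removed"): State.POWERED_OFF,   # assumed
    (State.FAIL, "id_read_ok"): State.TRANSITION,
    (State.FAIL, "id_read_failed"): State.FAIL,
    (State.FAIL, "power_removed"): State.POWERED_OFF,
}

# The state informs what the manager attempts next for the asset.
NEXT_ACTION = {
    State.POWERED_OFF: "check_power",
    State.INITIALIZE: "monitor_initialization",
    State.TRANSITION: "read_sensor",
    State.HEALTHY: "read_sensors",
    State.FAIL: "read_identifier",
}

MAX_FAILURES = 3  # assumed threshold for an "excessive" number of failures


class TrackedAsset:
    """State the manager keeps for one asset."""

    def __init__(self) -> None:
        self.state = State.POWERED_OFF
        self.failures = 0

    def next_action(self) -> str:
        return NEXT_ACTION[self.state]

    def handle(self, event: str) -> State:
        if self.state is State.FAIL and event == "id_read_failed":
            self.failures += 1
            if self.failures >= MAX_FAILURES:
                # Too many failures: instruct the asset to initialize.
                self.failures = 0
                self.state = State.INITIALIZE
                return self.state
        # Events that are not modeled leave the state unchanged.
        new_state = TRANSITIONS.get((self.state, event), self.state)
        if new_state is not State.FAIL:
            self.failures = 0
        self.state = new_state
        return self.state


asset = TrackedAsset()
events = [
    "power_applied", "init_success", "sensor_read_ok", "sensor_read_failed",
    "id_read_ok", "sensor_read_failed",
    "id_read_failed", "id_read_failed", "id_read_failed",
]
history = [asset.handle(e) for e in events]
```

Driving the choice of next action from the stored state is what lets a manager skip operations that are incompatible with an asset's current state, for example attempting only an identifier read while an asset is in the fail state.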
Turning to
The computing device 500 also typically includes computer readable media, which can include any available media that can be accessed by computing device 500. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computing device 500. Computer storage media, however, does not include communication media. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
When using communication media, the computing device 500 may operate in a networked environment via logical connections to one or more remote computers. The logical connection depicted in
Among computer storage media, the system memory 530 comprises computer storage media in the form of volatile and/or nonvolatile memory, including Read Only Memory (ROM) 531 and Random Access Memory (RAM) 532. A Basic Input/Output System 533 (BIOS), containing, among other things, code for booting the computing device 500, is typically stored in ROM 531. RAM 532 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 520. By way of example, and not limitation,
The computing device 500 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media discussed above and illustrated in
As can be seen from the above descriptions, decentralized hardware asset management has been presented. In view of the many possible variations of the subject matter described herein, we claim as our invention all such embodiments as may come within the scope of the following claims and equivalents thereto.