Businesses provide services that require a large amount of computing resources at some point in the development or provision of the service. For example, a computer animation company may distribute rendering processing load to a number of computers to produce animations quickly. In another example, an online sales company may distribute incoming requests to a large number of computers acting as web servers to handle a larger traffic load than can be handled by a single computer. Typically, when a business utilizes a large number of computers for load distribution solutions, the computers are stored on racks in a data center. A conventional data center is a room, a floor, or sometimes even an entire building dedicated to housing computing systems configured to perform specific tasks.
One specific concern when designing a data center is heat management. As computers give off heat while operating, a data center may become hot if many computers are operating at the same time. Too much heat can lead to premature system failure. This may create undesirable costs for businesses including loss of revenue due to downtime and increased system repair expenses. Conventional data center thermal monitoring systems may utilize a sparse set of sensors spread throughout a data center. However, such sensor sets are often too sparse to accurately track heat data in specific areas of a data center. Thus, conventional systems may rely on exception reporting from the machines themselves to determine when heat is exceeding a prescribed threshold. However, at this point, it may be too late to take steps to prevent system failure and avoid the associated cost. One reason why a data center may have a sparse array of sensors may be the difficulty in visually analyzing data from a large number of sensors in a data center simultaneously over a long period of time. See, for example,
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate various example systems, methods, and other example embodiments of various aspects of the invention. It will be appreciated that the illustrated element boundaries (e.g., boxes, groups of boxes, or other shapes) in the figures represent one example of the boundaries. One of ordinary skill in the art will appreciate that in some examples one element may be designed as multiple elements or that multiple elements may be designed as one element. In some examples, an element shown as an internal component of another element may be implemented as an external component and vice versa. Furthermore, elements may not be drawn to scale.
Example systems and methods associated with spatial temporal visual analysis of thermal data are described. One example method includes storing temperature data received from a set of sensors in a data center. For example,
The example method also includes displaying the multiple dimensions of data from the sensors in a two dimension graphic. The graphic may be displayed, for example, at a monitoring station outside of the data center.
Thus, two dimension graphic 200 displays a spatial temporal multi-dimensional visualization with physical locations, time series data, temperature sensor data, and so on. This may allow more accurate real time monitoring of heat patterns in the data center. Further, monitoring and storing temperature data in this manner may provide the unexpected utility of allowing data queries that aid in precisely identifying, diagnosing, and treating causes of heat anomalies that are detected.
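By way of illustration only, the stored multi-dimensional temperature data and a simple data query over it might be sketched as follows. The record layout, the field names (x, y, z, timestamp, celsius), and the function select_readings are assumptions made for this sketch and are not part of the described systems.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Reading:
    """One stored temperature sample (all field names are hypothetical)."""
    x: int            # sensor position on the first axis of the arrangement
    y: int            # sensor position on the second axis
    z: int            # sensor position on the third axis
    timestamp: int    # sample time, e.g., seconds since an arbitrary epoch
    celsius: float    # measured temperature


def select_readings(
    readings: Iterable[Reading],
    sensors: Set[Tuple[int, int, int]],
    start: int,
    end: int,
) -> List[Reading]:
    """Return readings from the chosen sensor positions within [start, end].

    Results are ordered by sensor position, then chronologically.
    """
    chosen = [
        r for r in readings
        if (r.x, r.y, r.z) in sensors and start <= r.timestamp <= end
    ]
    return sorted(chosen, key=lambda r: (r.x, r.y, r.z, r.timestamp))
```

In this sketch, a query for one sensor over a time window returns only that sensor's samples in chronological order, which is the form a display or analysis step would likely consume.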
The following includes definitions of selected terms employed herein. The definitions include various examples and/or forms of components that fall within the scope of a term and that may be used for implementation. The examples are not intended to be limiting. Both singular and plural forms of terms may be within the definitions.
References to “one embodiment”, “an embodiment”, “one example”, “an example”, and so on, indicate that the embodiment(s) or example(s) so described may include a particular feature, structure, characteristic, property, element, or limitation, but that not every embodiment or example necessarily includes that particular feature, structure, characteristic, property, element or limitation. Furthermore, repeated use of the phrase “in one embodiment” does not necessarily refer to the same embodiment, though it may.
ASIC: application specific integrated circuit.
CD: compact disk.
CD-R: CD recordable.
CD-RW: CD rewriteable.
DVD: digital versatile disk and/or digital video disk.
HTTP: hypertext transfer protocol.
LAN: local area network.
PCI: peripheral component interconnect.
PCIE: PCI express.
RAM: random access memory.
DRAM: dynamic RAM.
SRAM: static RAM.
ROM: read only memory.
PROM: programmable ROM.
SQL: structured query language.
OQL: object query language.
USB: universal serial bus.
XML: extensible markup language.
WAN: wide area network.
“Computer component”, as used herein, refers to a computer-related entity (e.g., hardware, firmware, software in execution, combinations thereof). Computer components may include, for example, a process running on a processor, a processor, an object, an executable, a thread of execution, and a computer. A computer component(s) may reside within a process and/or thread. A computer component may be localized on one computer and/or may be distributed between multiple computers.
“Computer communication”, as used herein, refers to a communication between computing devices (e.g., computer, personal digital assistant, cellular telephone) and can be, for example, a network transfer, a file transfer, an applet transfer, an email, an HTTP transfer, and so on. A computer communication can occur across, for example, a wireless system (e.g., IEEE 802.11), an Ethernet system (e.g., IEEE 802.3), a token ring system (e.g., IEEE 802.5), a LAN, a WAN, a point-to-point system, a circuit switching system, a packet switching system, and so on.
“Computer-readable medium”, as used herein, refers to a medium that stores signals, instructions and/or data. A computer-readable medium may take forms, including, but not limited to, non-volatile media, and volatile media. Non-volatile media may include, for example, optical disks, magnetic disks, and so on. Volatile media may include, for example, semiconductor memories, dynamic memory, and so on. Common forms of a computer-readable medium may include, but are not limited to, a floppy disk, a flexible disk, a hard disk, a magnetic tape, other magnetic medium, an ASIC, a CD, other optical medium, a RAM, a ROM, a memory chip or card, a memory stick, and other media from which a computer, a processor or other electronic device can read.
In some examples, “database” is used to refer to a table. In other examples, “database” may be used to refer to a set of tables. In still other examples, “database” may refer to a set of data stores and methods for accessing and/or manipulating those data stores.
“Data store”, as used herein, refers to a physical and/or logical entity that can store data. A data store may be, for example, a database, a table, a file, a data structure (e.g., a list, a queue, a heap, a tree), a memory, a register, and so on. In different examples, a data store may reside in one logical and/or physical entity and/or may be distributed between two or more logical and/or physical entities.
“Logic”, as used herein, includes but is not limited to hardware, firmware, software in execution on a machine, and/or combinations of each to perform a function(s) or an action(s), and/or to cause a function or action from another logic, method, and/or system. Logic may include a software controlled microprocessor, a discrete logic (e.g., ASIC), an analog circuit, a digital circuit, a programmed logic device, a memory device containing instructions, and so on. Logic may include one or more gates, combinations of gates, or other circuit components. Where multiple logical logics are described, it may be possible to incorporate the multiple logical logics into one physical logic. Similarly, where a single logical logic is described, it may be possible to distribute that single logical logic between multiple physical logics.
An “operable connection”, or a connection by which entities are “operably connected”, is one in which signals, physical communications, and/or logical communications may be sent and/or received. An operable connection may include a physical interface, an electrical interface, and/or a data interface. An operable connection may include differing combinations of interfaces and/or connections sufficient to allow operable control. For example, two entities can be operably connected to communicate signals to each other directly or through one or more intermediate entities (e.g., processor, operating system, logic, software). Logical and/or physical communication channels can be used to create an operable connection.
“Query”, as used herein, refers to a semantic construction that facilitates gathering and processing information. A query may be formulated in a database query language (e.g., SQL), an OQL, a natural language, and so on.
“Signal”, as used herein, includes but is not limited to, electrical signals, optical signals, analog signals, digital signals, data, computer instructions, processor instructions, messages, a bit, a bit stream, and so on, that can be received, transmitted and/or detected.
“Software”, as used herein, includes but is not limited to, one or more executable instructions that cause a computer, processor, or other electronic device to perform functions, actions and/or behave in a desired manner. “Software” does not refer to stored instructions being claimed as stored instructions per se (e.g., a program listing). The instructions may be embodied in various forms including routines, algorithms, modules, methods, threads, and/or programs including separate applications or code from dynamically linked libraries.
“User”, as used herein, includes but is not limited to one or more persons, software, logics, computers or other devices, or combinations of these.
Some portions of the detailed descriptions that follow are presented in terms of algorithms and symbolic representations of operations on data bits within a memory. These algorithmic descriptions and representations are used by those skilled in the art to convey the substance of their work to others. An algorithm, here and generally, is conceived to be a sequence of operations that produce a result. The operations may include physical manipulations of physical quantities. Usually, though not necessarily, the physical quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a logic, and so on. The physical manipulations create a concrete, tangible, useful, real-world result.
It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, and so on. It should be borne in mind, however, that these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, it is to be appreciated that throughout the description, terms including processing, computing, determining, and so on, refer to actions and processes of a computer system, logic, processor, or similar electronic device that manipulates and transforms data represented as physical (electronic) quantities.
Example methods may be better appreciated with reference to flow diagrams. For purposes of simplicity of explanation, the illustrated methodologies are shown and described as a series of blocks. However, it is to be appreciated that the methodologies are not limited by the order of the blocks, as some blocks can occur in different orders and/or concurrently with other blocks from that shown and described. Moreover, fewer than all the illustrated blocks may be required to implement an example methodology. Blocks may be combined or separated into multiple components. Furthermore, additional and/or alternative methodologies can employ additional, not illustrated blocks.
Method 300 also includes, at 340, displaying a subset of the set of temperature data in a two dimension graphic on a computer display. The computer display may be located at a monitoring station that is external to the data center. Members of the subset of the set of temperature data associated with sensors having a common position on a first axis associated with the fixed arrangement may be geometrically related in a first direction in the two dimension graphic. The first direction may be horizontally in the graphic. Members of the subset of the set of temperature data associated with sensors having a common position on a second axis associated with the fixed arrangement may be geometrically related in a second direction in the two dimension graphic. The second direction may be vertically in the graphic. Members of the subset of the set of temperature data associated with sensors having a common position on the second axis and having a common position on a third axis associated with the fixed arrangement may be geometrically related in a subdivision of the second direction in the two dimension graphic. Members of the subset of the set of temperature data associated with a sensor may be arranged chronologically. In one example, a set of temperature data associated with a sensor may be arranged chronologically from left to right. In this example, the leftmost bottom position in the graphic may be associated with an earliest member of the set of temperature data associated with the sensor. The rightmost top position in the graphic may be associated with a latest member of the set of temperature data associated with the sensor. A value associated with a member of the set of temperature data may be represented in the two dimension graphic by a color scale from low temperature (green) to medium (yellow), and to high (red).
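By way of illustration only, the color scale and the placement of sensor data in the two dimension graphic might be sketched as follows. The function names, the default temperature bounds of 18 and 40 degrees Celsius, the linear interpolation, and the choice to map the first axis to columns are assumptions made for this sketch; the description above does not prescribe them.

```python
def temperature_to_rgb(celsius, low=18.0, high=40.0):
    """Map a temperature to an (r, g, b) tuple on a green-yellow-red scale.

    `low` and `high` are assumed bounds; readings outside them saturate.
    """
    # Normalize to [0, 1] and clamp out-of-range readings.
    t = min(1.0, max(0.0, (celsius - low) / (high - low)))
    if t <= 0.5:
        # Green (0, 255, 0) toward yellow (255, 255, 0): raise the red channel.
        return (round(510 * t), 255, 0)
    # Yellow (255, 255, 0) toward red (255, 0, 0): lower the green channel.
    return (255, round(510 * (1.0 - t)), 0)


def cell_position(x, y, z, slots_per_group):
    """Return the (column, row) of a sensor's cell in the graphic.

    Sensors sharing a first-axis position share a column. The second axis
    selects a band of rows, and the third axis subdivides that band.
    """
    column = x
    row = y * slots_per_group + z
    return column, row
```

Within each cell, the samples for one sensor would then be drawn chronologically, for example from left to right, each colored by temperature_to_rgb.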
In one example, a method may be implemented as computer executable instructions. Thus, in one example, a computer-readable medium may store computer executable instructions that if executed by a machine (e.g., processor) cause the machine to perform a method. While executable instructions associated with the above method are described as being stored on a computer-readable medium, it is to be appreciated that executable instructions associated with other example methods described herein may also be stored on a computer-readable medium.
Method 400 includes, at 430, receiving a display query. The display query may identify a set of requested temperature data associated with selected members of the set of sensors. The set of sensors may be located in a single rack and/or in different racks. Thus, displaying temperature data at 440 may be performed in response to receiving the display query. The display query may be received from a querying agent. The querying agent may be, for example, an administrator, a logic, and so on. The display query may identify a subset of the sensors and a time period regarding which the querying agent desires temperature data. An example graphic produced in response to the query is presented in
Method 400 also includes, at 450, receiving a visual analytic query. The visual analytic query may seek a probability of a relationship existing between multiple subsets of the set of temperature data. For example, the visual analytic query may seek a sensor that is determined to be a most closely related sensor to a selected sensor. The visual analytic query may also seek information regarding sensors surrounding a selected sensor. The visual analytic query may be received from a querying agent. In one example, the querying agent may be a user. Thus, the visual analytic query may be generated by a user selecting a portion of the two dimension graphic. In one example, the user may select a portion of the two dimension graphic by drawing a selection rectangle over a desired portion. The visual analytic query may identify a first sensor, a second sensor, and a time period. Method 400 also includes, at 460, calculating a set of correlation data. Calculating the set of correlation data may include computing a probability of a relationship existing between temperature data associated with the first sensor over the time period and temperature data associated with the second sensor over the same time period. The first sensor may be a selected sensor and the second sensor may be a sensor most related to the selected sensor. In one example, a relationship may be considered to exist if temperature data associated with the first sensor and temperature data associated with the second sensor exhibit similar temperature values. In another example, a relationship may be considered to exist if temperature data associated with the first sensor and temperature data associated with the second sensor exhibit similar simultaneous changes in temperature. This type of relationship may indicate an overall data center temperature change.
In still another example, a relationship may be considered to exist if temperature data associated with the first sensor and temperature data associated with the second sensor exhibit similar changes in temperature on a time delay. This type of relationship may facilitate detecting causal heat relationships in the data center. Method 400 also includes, at 470, providing the set of correlation data. In one example, the set of correlation data may be provided to the querying agent. An example graphical output of correlation data is presented in
In one example, a marker may automatically identify abnormal data points such as hotspots. This may guide an administrator to anomalies. After locating an anomaly, the administrator may select a marked area using a graphical query. The query may initiate a calculation to determine relationships between attributes associated with the selected data anomaly and attributes associated with sensors near the sensor with which the anomaly is associated. In one example, the query results may be presented in a graphic that allows the administrator to further refine results. This may allow the administrator to determine thermal correlations between the selected sensor and nearby sensors.
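By way of illustration only, one simple way a marker might flag hotspots is sketched below. The description above does not specify how abnormal data points are identified; the rule used here (flagging sensors whose latest reading lies more than a chosen number of standard deviations above the mean of all sensors), the function name mark_hotspots, and the default threshold of 2.0 are assumptions made for this sketch.

```python
from statistics import mean, pstdev


def mark_hotspots(latest, z_threshold=2.0):
    """Return the ids of sensors whose latest temperature is unusually high.

    `latest` maps a (hypothetical) sensor id to its most recent reading.
    A sensor is flagged when its reading exceeds the mean of all readings
    by more than `z_threshold` population standard deviations.
    """
    values = list(latest.values())
    if len(values) < 2:
        return []  # too few sensors to say what is abnormal
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all sensors agree, so nothing stands out
    return sorted(
        sensor for sensor, value in latest.items()
        if (value - mu) / sigma > z_threshold
    )
```

A rule of this kind only flags readings that are high relative to the rest of the data center at the same moment; a fixed temperature threshold or a per-sensor history could be used instead.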
By way of illustration, it is possible that a piece of paper may become trapped in a computer in a rack in a data center. Conventional systems with few sensors may not be able to determine that this has occurred until the computer itself has overheated and an exception has been reported. In the lucky event that a sensor has been placed near enough to the computer that a change is detected, a data center administrator may still be unable to determine the specific source of the problem as the sensor would be responsible for detecting general temperatures for a large portion of the data center whereas only one computer is exhibiting abnormal behavior. Multiple sensors spread throughout a data center on a large portion of the racks would show that only computers in the one specific area are having an issue as the two dimension graphic would show that the rest of the data center is operating normally. This would allow the administrator to specifically look for something causing a highly localized heat anomaly. In another example, servers may not be stacked properly in a rack. For example, there may be gaps between servers that lead to mixing of hot and cold air. Example systems and methods may facilitate identifying such improper stacking.
In another example, a portion of a data center's cooling system may fail. While detectable, conventional systems using few sensors may take a significant amount of time to register the change if a sparse array of sensors does not cover the area of the origination of the anomaly. By using a large number of sensors on multiple racks, temperature values may quickly begin to fluctuate and may precisely identify the source of the anomaly as temperatures recorded closer to the failed portion of the cooling system may be higher than those farther away.
System 500 may also include a display logic 530. Display logic 530 may display a subset of the set of temperature data as a graphic having two dimensions. Members of the set of temperature data sharing a value of one of the multiple characteristics may share a feature on the two dimension graphic. For example, temperature data associated with sensors sharing a location on the first dimension may share a column in the two dimension graphic, and so on.
System 600 includes a correlation logic 640. Correlation logic 640 may calculate a set of correlation data in response to receiving a visual analytic query. The visual analytic query may identify a first member of the set of sensors, a second member of the set of sensors, and a time period. The first sensor may be a selected sensor, and the second sensor may be a sensor most related to the selected sensor. Calculating the set of correlation data may include computing a probability of a relationship existing between temperature data associated with the first sensor over the time period and temperature data associated with the second sensor over the time period. Correlation logic 640 may also provide the set of correlation data. Providing the set of correlation data may include sending a signal, outputting a graphic, changing a value, updating a database, and so on. In addition to indicating correlations, metrics can be displayed and analyzed to identify inefficiencies in data centers.
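By way of illustration only, a correlation calculation between two sensors over a common time period might be sketched as follows. The description above refers to computing a probability of a relationship and does not name a statistic; this sketch substitutes the Pearson correlation coefficient as a stand-in measure of relatedness, and the function names pearson and best_lag, as well as the max_lag parameter, are assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no trend to correlate
    return cov / sqrt(var_a * var_b)


def best_lag(first, second, max_lag):
    """Find the delay (in samples) at which `second` best follows `first`.

    Returns (lag, correlation). A lag of 0 corresponds to simultaneous
    changes; a positive lag means `second` changes after `first`.
    """
    best = (0, pearson(first, second))
    for lag in range(1, max_lag + 1):
        if len(first) - lag < 2:
            break  # not enough overlapping samples left
        r = pearson(first[:-lag], second[lag:])
        if r > best[1]:
            best = (lag, r)
    return best
```

A high correlation at lag 0 would suggest simultaneous changes, such as an overall data center temperature change, while a high correlation at a positive lag would suggest that a change at the first sensor precedes a similar change at the second.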
Thus, logic 730 may provide means (e.g., hardware, software, firmware) for recording a set of temperature data from a set of sensors, where the set of temperature data includes three or more dimensions of data. Logic 730 may also provide means (e.g., hardware, software, firmware) for displaying a subset of the set of temperature data, where the three or more dimensions of data are arranged on a two dimension display. The means associated with logic 730 may be implemented, for example, as an ASIC. The means may also be implemented as computer executable instructions that are presented to computer 700 as data 716 that are temporarily stored in memory 704 and then executed by processor 702.
Generally describing an example configuration of the computer 700, the processor 702 may be any of a variety of processors, including dual microprocessor and other multi-processor architectures. A memory 704 may include volatile memory and/or non-volatile memory. Non-volatile memory may include, for example, ROM, PROM, and so on. Volatile memory may include, for example, RAM, SRAM, DRAM, and so on.
A disk 706 may be operably connected to the computer 700 via, for example, an input/output interface (e.g., card, device) 718 and an input/output port 710. The disk 706 may be, for example, a magnetic disk drive, a solid state disk drive, a floppy disk drive, a tape drive, a Zip drive, a flash memory card, a memory stick, and so on. Furthermore, the disk 706 may be a CD-ROM drive, a CD-R drive, a CD-RW drive, a DVD ROM drive, a Blu-Ray drive, an HD-DVD drive, and so on. The memory 704 can store a process 714 and/or a data 716, for example. The disk 706 and/or the memory 704 can store an operating system that controls and allocates resources of the computer 700.
The bus 708 may be a single internal bus interconnect architecture and/or other bus or mesh architectures. While a single bus is illustrated, it is to be appreciated that the computer 700 may communicate with various devices, logics, and peripherals using other busses (e.g., PCIE, 1394, USB, Ethernet). The bus 708 can be of types including, for example, a memory bus, a memory controller, a peripheral bus, an external bus, a crossbar switch, and/or a local bus.
The computer 700 may interact with input/output devices via the input/output interfaces 718 and the input/output ports 710. Input/output devices may be, for example, a keyboard, a microphone, a pointing and selection device, cameras, video cards, displays, the disk 706, the network devices 720, and so on. The input/output ports 710 may include, for example, serial ports, parallel ports, and USB ports.
The computer 700 can operate in a network environment and thus may be connected to the network devices 720 via the input/output interfaces 718 and/or the input/output ports 710. Through the network devices 720, the computer 700 may interact with a network. Through the network, the computer 700 may be logically connected to remote computers. Networks with which the computer 700 may interact include, but are not limited to, a LAN, a WAN, and other networks.
While example systems, methods, and so on have been illustrated by describing examples, and while the examples have been described in considerable detail, it is not the intention of the applicants to restrict or in any way limit the scope of the appended claims to such detail. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the systems, methods, and so on described herein. Therefore, the invention is not limited to the specific details, the representative apparatus, and illustrative examples shown and described. Thus, this application is intended to embrace alterations, modifications, and variations that fall within the scope of the appended claims.
To the extent that the term “includes” or “including” is employed in the detailed description or the claims, it is intended to be inclusive in a manner similar to the term “comprising” as that term is interpreted when employed as a transitional word in a claim.
To the extent that the term “or” is employed in the detailed description or claims (e.g., A or B) it is intended to mean “A or B or both”. When the applicants intend to indicate “only A or B but not both” then the term “only A or B but not both” will be employed. Thus, use of the term “or” herein is the inclusive, and not the exclusive use. See, Bryan A. Garner, A Dictionary of Modern Legal Usage 624 (2d. Ed. 1995).
To the extent that the phrase “one or more of, A, B, and C” is employed herein, (e.g., a data store configured to store one or more of, A, B, and C) it is intended to convey the set of possibilities A, B, C, AB, AC, BC, ABC, AAA, AAB, AABB, AABBC, AABBCC, and so on (e.g., the data store may store only A, only B, only C, A&B, A&C, B&C, A&B&C, A&A&A, A&A&B, A&A&B&B, A&A&B&B&C, A&A&B&B&C&C, and so on). It is not intended to require one of A, one of B, and one of C. When the applicants intend to indicate “at least one of A, at least one of B, and at least one of C”, then the phrasing “at least one of A, at least one of B, and at least one of C” will be employed.