In a typical client-server environment, the server manages a set of resources and enables clients to find and interact with those resources. For example, a file server lets users store and look up files on the server. In some cases, numerous incoming requests from clients can cause resource contention on the server, resulting in reduced system throughput and a degraded client experience.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
A feedback loop is created between a server and clients that provides the clients with health information of the server to assist in client-server traffic control. Health information is calculated for the server that measures a current health of the server. The health information is automatically provided to a client by the server in response to a request made by the client. The clients can utilize the received health information to determine when to request resources from the server.
Referring now to the drawings, in which like numerals represent like elements, various embodiments will be described. In particular,
Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Other computer system configurations may also be used, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. Distributed computing environments may also be used where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Referring now to
A basic input/output system containing the basic routines that help to transfer information between elements within the computer, such as during startup, is stored in the ROM 10. The computer 100 further includes a mass storage device 14 for storing an operating system 16, application program(s) 24, other program modules 25, and traffic manager 26 which will be described in greater detail below.
The mass storage device 14 is connected to the CPU 5 through a mass storage controller (not shown) connected to the bus 12. The mass storage device 14 and its associated computer-readable media provide non-volatile storage for the computer 100. Although the description of computer-readable media contained herein refers to a mass storage device, such as a hard disk or CD-ROM drive, the computer-readable media can be any available media that can be accessed by the computer 100.
By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, Erasable Programmable Read Only Memory (“EPROM”), Electrically Erasable Programmable Read Only Memory (“EEPROM”), flash memory or other solid state memory technology, CD-ROM, digital versatile disks (“DVD”), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 100.
Computer 100 operates in a networked environment using logical connections to remote computers through a network 18, such as the Internet. The computer 100 may connect to the network 18 through a network interface unit 20 connected to the bus 12. The network connection may be wireless and/or wired. The network interface unit 20 may also be utilized to connect to other types of networks and remote computer systems. The computer 100 may also include an input/output controller 22 for receiving and processing input from a number of other devices, including a keyboard, mouse, or electronic stylus (not shown in
Carrier network 28 is a network responsible for communicating with mobile devices 29. The carrier network 28 may include both wireless and wired components. For example, carrier network 28 may include a cellular tower that is linked to a wired telephone network. Typically, the cellular tower carries communication to and from mobile devices, such as cell phones, notebooks, pocket PCs, long-distance communication links, and the like. Gateway 27 routes messages between carrier network 28 and IP Network 18.
As mentioned briefly above, a number of program modules and data files may be stored in the mass storage device 14 and RAM 9 of the computer 100, including an operating system 16 suitable for controlling the operation of a computer, such as WINDOWS SERVER® or the WINDOWS 7® operating system from MICROSOFT CORPORATION of Redmond, Wash. The mass storage device 14 and RAM 9 may also store one or more program modules. In particular, the mass storage device 14 and the RAM 9 may store one or more application programs 24 and program modules 25.
Traffic manager 26 is configured to determine health information for computer 100 using the health monitor and to provide the health information to clients. The clients can use the health information in determining when to send requests to the server. Generally, a client (such as client 17) sends a request to the server (i.e., computer 100) to perform an action, such as a request for a resource. The server handles the request and, along with the response, sends the current health information for the server to the client. The health monitor is configured to determine a health score of the server based on performance counter values associated with the computer. According to one embodiment, an average, such as an Exponential Moving Average (EMA), is used to smooth out the spikes in the performance counter values while still giving more weight to the recent performance samples. For example, samples for the various performance counters being measured can be taken every five seconds on the server for a period of time. According to one embodiment, the health monitor continually measures the health of the server. According to another embodiment, the health is monitored on a different basis, such as at predetermined times according to a schedule and/or upon the occurrence of an event, such as receiving a request from the client, and the like. While traffic manager 26 is illustrated as an independent program, the functionality may be integrated into other software and/or hardware. The operation of traffic manager 26 is described in more detail below. User Interface 25 may be utilized to interact with traffic manager 26 and/or application programs 24.
During operation, clients 1-3 receive health information through IP network 18 from the server. This health information allows each client to be aware of the health of the server and schedule its requests to the server appropriately. The health information can take different forms. For example, the health information could be a health score and/or could include other health information for the server including, but not limited to, CPU usage, memory usage, and the like.
As discussed above, the health monitor is configured to determine the health of the server. According to one embodiment, the health monitor determines the health of the server periodically (e.g., every second, five seconds, ten seconds, one minute, etc.). This monitoring frequency may be fixed or variable, and may be changed by an authorized user. For instance, the health of the server could be checked every five seconds during one period and every minute during another period.
For purposes of illustration, assume that application 1 on client 1 requests an action to be performed by server 240 (e.g. retrieving data, writing a value to a data store, etc.). After server 240 has performed the action, server 240 provides the most current health information calculated by the health monitor to client 1 along with the response to the client's request. Before a subsequent request, the client may utilize the health information to determine when to send a request to server 240. For example, during some periods the health of the server may be very good during which time the client may send requests as frequently as needed. During other times, the server's health may have degraded thereby causing the client to possibly slow the frequency of requests it sends to the server.
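The client-side pacing described above can be sketched as follows. This is a minimal illustration, not the method of any particular embodiment: the function name `next_request_delay`, the linear scaling, and the default delay bounds are hypothetical, and the sketch assumes a health value between 0 and 1 in which a higher value indicates a more heavily loaded server.

```python
def next_request_delay(health_score: float,
                       base_delay: float = 0.0,
                       max_delay: float = 30.0) -> float:
    """Return the number of seconds a client waits before its next request.

    Assumption: health_score lies in [0, 1] and a higher value means the
    server is more heavily loaded. The linear scaling is illustrative only.
    """
    # Clamp out-of-range values so a malformed score cannot produce a
    # negative delay or a delay beyond max_delay.
    score = min(max(health_score, 0.0), 1.0)
    return base_delay + score * (max_delay - base_delay)
```

A client could call this function with the health value from the most recent response and sleep for the returned duration before sending its next non-urgent request.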
According to one embodiment, the health information that is provided to the client is a health value that is between 0 and 1. Other ranges and/or other information may be provided. For example, the information could provide detailed information about the server's current resource use, or the information could be somewhere between the single value and the detailed information. The health information may be a current snapshot of the health of the server or an average of the health of the server. According to one embodiment, an Exponential Moving Average (EMA) is utilized to track various performance counters on the server. The number of samples (N) can be preset and/or configurable. For example, the number of samples could be 6, 12, 24, 100, and the like. An exemplary formula for calculating the EMA is EMA(current)=((Value(current)−EMA(prev))×Multiplier)+EMA(prev), where the Multiplier=2/(N+1). The EMA is directed at smoothing out the spikes in the performance counter readings while still giving more weight to the recent performance samples. The higher the value of the multiplier, the more weight is given to the recent values.
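The exemplary EMA formula can be expressed directly in code. The following is a minimal sketch; the function names `ema_multiplier` and `ema_update` are hypothetical and chosen only for illustration.

```python
def ema_multiplier(n: int) -> float:
    """Multiplier = 2 / (N + 1), where N is the number of samples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2.0 / (n + 1)


def ema_update(value: float, prev_ema: float, n: int) -> float:
    """EMA(current) = ((Value(current) - EMA(prev)) x Multiplier) + EMA(prev)."""
    return (value - prev_ema) * ema_multiplier(n) + prev_ema
```

Because the multiplier shrinks as N grows, a larger N yields a smoother average that reacts more slowly to a new sample.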
The example below shows a CPU trace for a program that consists of 24 samples and its EMA calculations. In the illustrated example, a 12-sample window (N=12) is used to calculate the average. For N=12, the Multiplier=2/(12+1)=0.1538. In addition, the simple moving average (SMA) value of the first 12 samples is used to initialize the exponential calculation.
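The seeding step can be sketched as follows, assuming a trace held in a Python list. The function name `ema_series` is hypothetical, and the `cpu_trace` values are made-up placeholders, not the figures from the illustrated example.

```python
from statistics import fmean


def ema_series(samples: list[float], n: int) -> list[float]:
    """Return the EMA series for a trace, seeded with the SMA of the first n samples."""
    if n < 1 or len(samples) < n:
        raise ValueError("need at least n samples, with n >= 1")
    multiplier = 2.0 / (n + 1)
    ema = fmean(samples[:n])  # simple moving average of the first n samples
    series = [ema]
    for value in samples[n:]:
        ema = (value - ema) * multiplier + ema
        series.append(ema)
    return series


# Hypothetical 24-sample CPU trace (percent busy): a quiet period, then a burst.
cpu_trace = [20.0] * 12 + [80.0] * 12
series = ema_series(cpu_trace, 12)
```

With a 24-sample trace and N=12, the series holds 13 values: the seed plus one EMA value for each of the remaining 12 samples.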
Many different performance counters may be measured on the server. For example, performance counters may measure CPU usage, memory usage, wait time, queue length of requests, and the like.
After the performance counter values are obtained, each performance counter is normalized and mapped to a range of values. According to one embodiment, each performance counter is mapped to one of ten different values. The mapping normalizes the measurement of server load across different performance counters, since performance counters generally have different units of measure. As a result, scores for the different performance counters are obtained. For example: CPU Score (S_CPU), Memory Score (S_Memory), ASP.NET Queue Length Score (S_Queue), and ASP.NET Wait Time in the Queue Score (S_WaitTime).
The following table illustrates an exemplary mapping of values.
Other mappings may be implemented depending on the performance counters being measured. For example, different performance counters may require different curve fitting. Further, the values in the mapping table may be updated to further tune the performance and provide a more accurate health score.
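One way to implement such a mapping is a threshold lookup per counter. The sketch below is illustrative only: the function name `map_to_score`, the choice of output values 0.1 through 1.0, and every threshold in `CPU_BOUNDS` and `QUEUE_BOUNDS` are hypothetical and do not reproduce the exemplary mapping table.

```python
from bisect import bisect_right

# Hypothetical band boundaries. Nine boundaries split a counter's range into
# ten bands, so each counter maps to one of ten values.
CPU_BOUNDS = [10, 20, 30, 40, 50, 60, 70, 80, 90]       # percent busy, linear
QUEUE_BOUNDS = [1, 2, 4, 8, 16, 32, 64, 128, 256]       # queued requests, non-linear


def map_to_score(value: float, bounds: list[float]) -> float:
    """Map a raw (or averaged) counter value to one of ten scores, 0.1 to 1.0."""
    if len(bounds) != 9 or sorted(bounds) != list(bounds):
        raise ValueError("bounds must be nine ascending thresholds")
    band = bisect_right(bounds, value)  # 0 for the lowest band, 9 for the highest
    return (band + 1) / 10
```

Keeping a separate boundary list per counter allows each counter to follow its own curve, and the lists can be edited later to tune the resulting scores.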
After mapping the performance counters, a health of the server is determined. According to one embodiment, the maximum mapped value that is calculated for a performance counter is used as the server's health score. Other values may be used. For example, the mapped values may be averaged, different counters may be given more weight in calculating the score, and the like. Once the current health score of the server is determined, it is provided to the client when the server sends back the response to the request made by the client.
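The alternatives for combining the mapped values can be compared side by side. This is a minimal sketch; the function names, the counter names used as dictionary keys, and the weights are hypothetical.

```python
def health_max(scores: dict[str, float]) -> float:
    """Use the largest mapped value as the health score."""
    return max(scores.values())


def health_mean(scores: dict[str, float]) -> float:
    """Average the mapped values."""
    return sum(scores.values()) / len(scores)


def health_weighted(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average, giving some counters more influence than others."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical mapped values for three counters.
scores = {"cpu": 0.6, "memory": 0.3, "queue": 0.9}
```

Taking the maximum reports the most constrained resource, so a single saturated counter is not masked by otherwise idle ones; averaging trades that sensitivity for a smoother overall figure.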
Referring now to
Referring now to
After a start block, the process moves to operation 310, where the health of the server is monitored. As discussed above, the health of the server may be monitored continually, such that a current health score for the server can be obtained more quickly, or it may be monitored on some other basis. Different performance counters of the server may be monitored. For example, memory usage, CPU usage, wait time, processing time, and the like may be monitored.
Moving to block 320, the current health score for the server is determined. According to one embodiment, each averaged performance counter value is mapped to a normalized value such that the values are more easily compared. The health score may be computed in many different ways. For example, the largest normalized value may be used as the health score, the normalized values may be averaged, more weight may be assigned to certain performance counters, and the like.
Transitioning to operation 330, the server processes a received request from a client. The request may relate to many different items, such as requesting a resource or performing some other action.
Flowing to operation 340, the health score is sent to the client along with the response to the request initiated by the client. According to one embodiment, the health score is placed within a header of the response message. The health score may also be placed in other locations within the response. For example, the health score may be encoded within the body of the response.
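Both placements can be sketched with a simple response object. Everything named here is an assumption made for illustration: the header name `X-Health-Score`, the `Response` class, the two-decimal formatting, and the JSON body layout are hypothetical and are not specified by the description.

```python
import json
from dataclasses import dataclass, field

HEALTH_HEADER = "X-Health-Score"  # hypothetical header name


@dataclass
class Response:
    """Minimal stand-in for a server response message."""
    status: int
    headers: dict[str, str] = field(default_factory=dict)
    body: bytes = b""


def attach_health_header(response: Response, health_score: float) -> Response:
    """Place the health score within a header of the response."""
    response.headers[HEALTH_HEADER] = f"{health_score:.2f}"
    return response


def attach_health_body(payload: dict, health_score: float) -> bytes:
    """Alternative: encode the health score within the response body."""
    return json.dumps({"result": payload, "health": health_score}).encode("utf-8")
```

A header keeps the health score out of the application payload, so existing clients that ignore unknown headers continue to work unchanged.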
The process then flows to an end block and returns to processing other actions.
After a start operation, the process flows to operation 410, where the Exponential Moving Average (EMA) for each monitored performance counter is determined. The EMA is used to smooth out the spikes in the performance counter values while still giving more weight to the recent performance samples. A different number of samples may be utilized.
Moving to operation 420, each of the calculated EMAs is mapped to a normalized value. The mapping normalizes the measurement of server health across different performance counters, since performance counters generally have different units of measure. The table used for the mapping may be fixed and/or updated to further refine the mapping of the values.
Flowing to operation 430, the health score for the server is determined based on the mapped values. According to one embodiment, the health score is the largest value of the normalized performance counters.
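Operations 410 through 430 can be chained into one small component. The sketch below makes several assumptions that are not taken from the description: the class name `HealthMonitor`, the threshold lists passed to it, and the simplification of seeding each EMA with the first observed sample.

```python
from bisect import bisect_right


class HealthMonitor:
    """Tracks an EMA per counter, maps each EMA to a band, and reports the largest."""

    def __init__(self, bounds_by_counter: dict[str, list[float]], n: int = 12):
        self.bounds = bounds_by_counter      # hypothetical ascending thresholds per counter
        self.multiplier = 2.0 / (n + 1)
        self.ema: dict[str, float] = {}

    def record(self, counter: str, value: float) -> None:
        # Operation 410: update the EMA (the first sample seeds it, a simplification).
        prev = self.ema.get(counter, value)
        self.ema[counter] = (value - prev) * self.multiplier + prev

    def health_score(self) -> float:
        if not self.ema:
            raise ValueError("no samples recorded yet")
        # Operation 420: map each EMA to a normalized value.
        mapped = [
            (bisect_right(self.bounds[name], ema) + 1) / (len(self.bounds[name]) + 1)
            for name, ema in self.ema.items()
        ]
        # Operation 430: the largest normalized value is the health score.
        return max(mapped)
```

A server could call `record` each time a counter is sampled and read `health_score` when building a response.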
The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.