This application claims priority from European patent application 05300801.7, filed on Oct. 7, 2005. The entire content of the aforementioned application is incorporated herein by reference.
The present invention generally relates to methods, systems and computer program products for the real-time reporting of service level agreements, and for example, to methods, systems and computer program products for predicting whether a service level agreement will be complied with.
With IT infrastructures having emerged from a purely scientific environment into almost all companies, their economic aspects have continuously gained in importance over recent decades, so that nowadays numerous companies heavily rely on some sort of IT infrastructure (e.g. information servers, such as Web or database servers). Depending on the size and sophistication of a company, however, it may not be possible or practical to maintain the IT infrastructures in-house. Accordingly, some companies, such as electronic data processing centers (EDPC), offer servers and communication outsourcing services.
This development entails that nowadays IT infrastructures are not only an issue in computer science, but also in business administration where economic implications of IT infrastructures are researched. In this realm, attention is drawn, for instance, to the question of how a company (in this context a service provider) leasing out an IT infrastructure or services made available by means of an IT infrastructure may contractually assure to a customer that the IT infrastructure or the services leased out comply with conditions agreed upon in advance. These conditions usually concern the “quality-of-service” (QoS) which may refer to characteristics of the services themselves, such as availability, performance, reliability, transmission delay, bandwidth and up-time, but may also refer to the capability of the service provider to repair the IT infrastructure in the event of an outage. A set of contract conditions in relation to target compliances, together with sanctions that apply if the target compliances are not fulfilled, is usually referred to as a service level agreement. In formal terms, a service level agreement (SLA) is a contract that formalizes a business relationship, or part of the relationship, between two parties. Most often, it takes the form of a negotiated contract made between a service provider and a customer and defines a price paid in exchange for an entitlement to a product or service to be delivered under certain terms, conditions, and with certain financial guarantees (cf. Lee, J. J., Ben-Natan, R., “Integrating Service Level Agreements”, Wiley Publishing Inc., 2002, p. 3).
The TeleManagement Forum's SLA Management Handbook defines an SLA as “[a] formal negotiated agreement between two parties, sometimes called a service level guarantee. Typically, it is a contract (or part of one) that exists between the service provider and the customer, designed to create a common understanding about services, priorities, responsibilities, etc.”
Historically, service level agreements arose in the early 1990s as a way for measuring and managing quality of service (QoS) that IT departments and service providers within private (usually corporate) computer networking environments delivered to their internal customers. It is foreseeable that the use of service level agreements will soon become the prevailing business model for delivering a large number of services. Service level agreements offer service providers the ability to distinguish themselves from competitors in today's volatile markets while providing a measure of security for their customers.
During the evaluation period of a service level agreement, a service provider may be interested in the likelihood that the conditions agreed upon with the customer will be complied with at the end of the evaluation period of an SLA. To this end, the service provider may be endowed with a prediction unit as part of a service level reporting unit which informs the service provider whether or not the service level agreement will be complied with. If the prediction unit notifies the service provider that the service level agreement will prospectively not be complied with, the service provider may preemptively react to this notification by allocating more reliable network resources, such as servers etc., in order to avoid any breaching of the service level agreement.
WO 02/42923 discloses a method, system, and computer program product for monitoring services (e.g., communications services and information server services) for compliance with a specified set of target criteria (e.g., as specified in a contract). The document also discloses a monitoring computer system including a prediction engine that uses large quantities of data that are gathered by measurements agents. With reference to historic data, the prediction engine analyzes whether current problems are indicators of future problems.
U.S. Pat. No. 6,556,659 discloses a service level management system which includes a proactive threshold manager that alerts service providers to a risk that a certain level of service is in danger of being breached. The proactive threshold manager provides an indication or alarm if the current level of service is within a predetermined range regarding the minimum service level which needs to be provided to subscribers. The alert is given in due time so that the provider has enough time to remedy the problem before a service level agreement is breached. The service level management system also includes a data-mining unit that provides the capability to analyze network management data looking for patterns and correlations across multiple dimensions. Thereby, models of data behavior are constructed in order to predict future growth or problems and facilitate a proactive management of the network.
U.S. Pat. No. 6,801,945 discloses systems and methods for the prediction of visitor traffic to a network of web site pages. The system also considers annual seasonality, day-of-week, holidays, special events, short histories, user demographics, user web behavior (viewing, listening and transacting) and parent and child web page characteristics.
US 2002/0152305 discloses a method of analyzing resource utilization information. The method is based on historical tracking of system performance parameters, such as resource availability and/or usage, adherence to provisioned SLA policies, content usage patterns, time-of-day access patterns, etc. Furthermore, a data analysis module is disclosed which is capable of predictive analysis, such as resource-utilization forecasting, processing engine requirement projections. A short term forecast algorithm is disclosed which is capable of predicting system workload for any desired selected unit of time based on historical resource utilization load on the system and/or given processing engine(s). Moreover, a long-term trend algorithm is mentioned which is capable of predicting an overall trend line and growth pattern for system workload and/or workload of a given processing engine.
A method is provided of predicting a degree of service-quality compliance in an IT infrastructure. The method is carried out at a current point of time within an evaluation period before the end of the evaluation period, wherein service-quality compliance means that a service-quality parameter of the IT infrastructure complies with a service-quality objective. A statistic is obtained which indicates probabilities that the service-quality parameter will comply with the service-quality objective in sub-periods of the future part of the evaluation period. The statistic is based on known frequencies in equivalent sub-periods in the past. A calculation on the basis of this statistic indicates an estimated duration in which the service-quality objective will be complied with during the future part of the evaluation period.
According to another aspect, a method is provided of predicting a degree of service-availability compliance in an IT infrastructure. The method is carried out at a current point of time within an evaluation period before the end of the evaluation period, wherein service-availability compliance means that a service of the IT infrastructure is available. A statistic is obtained which indicates probabilities that a service will be available in sub-periods of the future part of the evaluation period. The statistic is based on known frequencies in equivalent sub-periods in the past. A calculation on the basis of this statistic indicates an estimated duration in which the service will be available during the future part of the evaluation period.
According to another aspect, a computer system is provided for predicting a degree of service-quality compliance in an IT infrastructure at a current point of time within an evaluation period before the end of the evaluation period, wherein service-quality compliance means that a service-quality parameter of the IT infrastructure complies with a service-quality objective. The computer system is programmed to obtain a statistic indicating probabilities that the service-quality parameter will comply with the service-quality objective in sub-periods of the future part of the evaluation period, the statistic being based on known compliance frequencies in equivalent sub-periods in the past, and to calculate, on the basis of the statistic, an estimated duration in which the service-quality objective will be complied with during the future part of the evaluation period.
According to another aspect, a computer program product is provided which is either in the form of a machine-readable medium with program code stored on it, or in the form of a propagated signal comprising a representation of program code. The program code is arranged to carry out a method, when executed on a computer system, of predicting a degree of service-quality compliance in an IT infrastructure at a current point of time within an evaluation period before the end of the evaluation period, wherein service-quality compliance means that a service-quality parameter of the IT infrastructure complies with a service-quality objective. A statistic is obtained which indicates probabilities that the service-quality parameter will comply with the service-quality objective in sub-periods of the future part of the evaluation period. The statistic is based on known frequencies in equivalent sub-periods in the past. A calculation on the basis of this statistic indicates an estimated duration in which the service-quality objective will be complied with during the future part of the evaluation period.
Other features are inherent in the methods and products disclosed or will become apparent to those skilled in the art from the following detailed description of embodiments and its accompanying drawings.
Embodiments of the invention will now be described, by way of example, and with reference to the accompanying drawings, in which:
a illustrates a service level objective relating to several metrics of different network resources;
b illustrates a tree representation of the service-quality condition of the service level objective of
a shows a flowchart indicating the process of calculating an estimation of an availability percentage for the end of the second evaluation period, and of calculating a violation period in the future part of the evaluation period, during which the point of time occurs from which onward the service level objective is violated;
b shows a flowchart illustrating the course of action of collecting metric values, evaluating a service-quality condition tree and of updating the statistic;
The drawings and the description of the drawings are of embodiments of the invention and not of the invention itself.
In some of the embodiments, a degree of service-quality compliance in an IT infrastructure is predicted. The prediction is made at a current point of time within an evaluation period before the end of the evaluation period. As will be discussed in more detail below, service-quality compliance means that a service-quality parameter of the IT infrastructure complies with a service-quality objective. To perform the prediction, a statistic indicating probabilities that the service-quality parameter will comply with the service-quality objective in sub-periods of the future part of the evaluation period is obtained. In some of the embodiments, the statistic is based on known compliance frequencies in equivalent sub-periods in the past. An equivalent sub-period is, for example, the same day of a week, or the same hourly interval during a day. On the basis of the statistic, an estimated duration is calculated in which the service-quality objective will be complied with during the future part of the evaluation period.
It should be mentioned that the term “IT infrastructure” as used herein refers to both computer networks and telecommunication networks.
A service level agreement (SLA) is a contract, in which a customer wishing to use a service, typically based on network resources, and a provider supplying the desired service agree upon the service itself, performance levels, responsibilities and modalities, such as the time period during which the provider makes available the service. The term “service”, as used herein, may refer to either providing one or more network resources as hardware entities or providing hardware entities on which application programs are installed, which the customer is entitled to access. Performance levels indicate the availability of the service the customer and the provider have agreed upon. In general, executing an SLA contractually sets the customer's expectations regarding a product's delivery. Once defined, agreed to, and executed, the terms and conditions that make up the bulk of the SLA contract become the customer's entitlements with respect to the service. This warranty enables the customer to plan and operate his or her business with a reasonable level of confidence in the availability, performance, or timeframe of a contracted service (cf., for example, J. Lee et al., “Integrating Service Level Agreements”, p. 8, Wiley Publishing, 2002).
Typically, a customer may choose among different service level options, which are frequently referred to as platinum, gold, silver, bronze, etc., each of them guaranteeing a different service level, with platinum as the highest service level option. Thereby, a customer is able to select a service level option corresponding to his/her requirements, and different service level options may be agreed upon for different time periods. For instance, a customer leasing an IT infrastructure for an online shop may choose a platinum service level option during the day and a silver service level option during the night, since most purchases are made during the day.
Service level agreements also constitute an endorsement for the service provider since s/he is well aware of a customer's expectations and may therefore better attune to them. The provider is able to plan his/her IT infrastructure according to the conditions to which s/he has committed himself/herself in the service level agreement.
A service level agreement typically relies on metrics relating to network resources of an IT infrastructure. There are two main types or classifications for SLA metrics. The first type measures the quantity, quality, availability, and level of service delivered by the IT infrastructure. The measurement is based on the ability of the service provider to compile statistics from the network elements themselves using automated reporting generated from a network management function. These measurements are sometimes referred to as infrastructure metrics. Infrastructure metrics may include the following: available capacity, available throughput, discarded packets, discarded frames, access time, resource availability, resource utilization, etc. The second type of metrics measures the provider's ability to provide resources to deploy, operate, and maintain the services at the level contracted for. The primary focus of this type of metrics is to measure the performance of the service provider's operations infrastructure (technical support) relative to activities that affect the ability of the network to deliver the services. These are sometimes referred to as infrastructure independent metrics and include the following: mean time between failures (MTBF), mean time to provision (MTTP), mean time to repair (MTTR), etc.
A typical service level agreement includes, besides a description of the service itself (what is provided, during which time, to which customer, etc.) and the penalties in the event of non-compliance, a definition indicating which objective the service has to meet; for example, if the service level agreement refers to service availability, this will be a definition of when the service is assumed available. Such a definition is referred to as service-quality condition, if it refers to a point of time. If an evaluation period is considered, within which the service-quality condition has to be fulfilled during a certain duration, typically indicated as a percentage value (target service level objective compliance), then the term “service level objective” (SLO) is used. A service-quality condition is preferably represented in the form of a tree and basically represents a condition involving one or more metric values. A service-quality condition is a service-quality parameter of one or more resources of the IT infrastructure in comparison to a service-quality objective. The evaluation of a service-quality condition yields the compliance of a service-quality parameter, i.e. a True/False (or 1/0) answer, whether the service-quality parameter is above or below a service-quality objective. A service-quality parameter is associated with a node of the second-highest level of a service-quality condition tree, i.e. before the comparison with a service-quality objective. In some of the embodiments, the service-quality parameter is elementary in that it refers to only one metric being compared with a service-quality objective, whereas in other embodiments, the service-quality parameter is a composite service-quality parameter referring to the evaluation of a complex condition comprising several metrics. 
Furthermore, a service level objective is put in relation to a target SLO compliance, which is typically a percentage value, indicating which percentage portion of the entire evaluation period of an SLO the service-quality condition has to be complied with, so that the SLO is complied with.
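The service-quality condition tree described above may be sketched as follows. This is a minimal illustration only: the tuple-based node layout, the combiner names ("max", "min", "sum") and the metric name web_server_response_time are assumptions made for the example and are not taken from the embodiments. The root compares a service-quality parameter with a service-quality objective; the parameter is either a single metric (elementary) or a combination of several metrics (composite).

```python
import operator

# Hypothetical node layout (an assumption for illustration):
#   condition tree : (comparator, parameter_node, objective)
#   parameter node : ("metric", name) or ("combine", how, [child nodes])
COMPARATORS = {"<": operator.lt, ">": operator.gt,
               "<=": operator.le, ">=": operator.ge}
COMBINERS = {"max": max, "min": min, "sum": sum}


def evaluate_parameter(node, metrics):
    """Return the value of an elementary or composite service-quality parameter."""
    kind = node[0]
    if kind == "metric":
        return metrics[node[1]]
    if kind == "combine":
        _, how, children = node
        values = [evaluate_parameter(child, metrics) for child in children]
        return COMBINERS[how](values)
    raise ValueError("unknown node kind: %r" % (kind,))


def evaluate_condition(tree, metrics):
    """Compare the parameter with the objective; yields True or False."""
    comparator, parameter_node, objective = tree
    value = evaluate_parameter(parameter_node, metrics)
    return COMPARATORS[comparator](value, objective)


# Elementary condition: one metric compared with one objective.
elementary = ("<", ("metric", "database_access_time"), 0.3)

# Composite condition: the sum of two metrics compared with one objective.
composite = ("<",
             ("combine", "sum", [("metric", "database_access_time"),
                                 ("metric", "web_server_response_time")]),
             0.5)

metrics = {"database_access_time": 0.2, "web_server_response_time": 0.4}
elementary_ok = evaluate_condition(elementary, metrics)
composite_ok = evaluate_condition(composite, metrics)
```

With the sample metric values, the elementary condition is complied with, whereas the composite condition is not, because the summed value exceeds the objective of 0.5.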
In an SLA environment, a service provider normally wishes to receive reports about values pertaining to metrics, or generally about a service-quality parameter, on a nearly real-time basis. As mentioned above, in some of the embodiments, the service-quality parameter is elementary in that it refers to a single metric being compared with a service-quality objective. In other embodiments, the service-quality parameter is a composite service-quality parameter referring to the evaluation of a complex combination of several elementary metrics. Metric values are transmitted from metric adapters to a metric collector at a central SLA reporting station. However, from a customer's perspective, a relevant point of measurement may be the one that is contractually defined in the SLA as the service access point (SAP). Therefore, in some of the embodiments metric adapters are not only used for the network devices of the IT infrastructure leased out by the service provider, but are also provided at the SAPs. An SAP is the physical termination point (or device) where the service provider's responsibilities end and those of the customer begin. Delivery of the service to the SAPs is usually the customer's only concern within the entire network. Thus, in some of the embodiments, products and services delivered under SLAs are measurable at the SAPs.
SLAs are intended to guarantee the service provider's performance at a predefined quality-of-service (QoS) level at a designated service access point (SAP). QoS is defined by the International Telecommunications Union (ITU-T) as “the collective effect of service performances, which determine the degree of satisfaction of a user of the service. The quality of service is characterized by the combined aspects of service support performance, service operability performance, service integrity and other factors specific to each service.” To ensure performance, service provider performance at the SAPs is tied to a set of financial penalties. The intent is to penalize non-compliance in order to provide motivation for service providers to deliver SLA-compliant performance. Quality of service has become the standard by which service providers are judged. The focus of QoS has shifted away from the service provider's view of the network technology and is instead homing in on the impact of availability on the customer's business. The financial models of SLAs have not kept pace with this evolution. Pricing can be expected to evolve from the current provider-focused penalty-formulation methodology to one that is much more aligned to the business impact experienced by the customer.
For example, according to a provider-centric methodology, a penalty of 1.00% of the service charge invoiced to the affected customer for a given month is credited to the customer for each 0.10 percent by which performance falls below the performance requirement. However, this penalty does not reflect the business losses that are entailed by non-compliance with the performance requirement agreed upon.
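The provider-centric penalty rule may be illustrated by the following sketch. The function name, the sample figures, and the decision to count only complete 0.10-percent steps are assumptions made for the example; the text does not state how partial steps are treated.

```python
from decimal import Decimal


def provider_centric_credit(invoiced_charge, required_percent, achieved_percent):
    """Credit 1.00% of the invoiced charge per full 0.10 percent of shortfall.

    Arguments are passed as strings so that Decimal arithmetic stays exact.
    Assumption: only complete 0.10-percent steps are counted.
    """
    shortfall = Decimal(required_percent) - Decimal(achieved_percent)
    if shortfall <= 0:
        return Decimal("0")
    full_steps = shortfall // Decimal("0.10")
    return Decimal(invoiced_charge) * Decimal("0.01") * full_steps


# Hypothetical month: requirement 99.50%, achieved 99.28%, invoice 10000.
credit = provider_centric_credit("10000", "99.50", "99.28")
```

In this hypothetical case the shortfall of 0.22 percent contains two full steps, so 2% of the invoiced charge is credited.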
Therefore, the intent of a business-impact approach is to mitigate the business risks associated with total dependence on the telecom service provider, which is obviously much more closely aligned to the true intent of SLAs as used by customers today. For instance, a customer operating an online shop wants its website to be available within 3 seconds to a potential purchaser accessing it during 99.98% of the evaluation period of an SLA, and wants this condition to be incorporated into the SLA. If the service provider fails to meet that condition, the customer desires monetary compensation according to his/her business losses. However, the business losses that occurred as a direct result of the non-compliance with the SLA cannot be measured objectively, so that the business-impact approach is still uncommon. What will most likely evolve is the use of different types of historical data and statistical averages for sales transactions to compare the period of SLA non-compliance with a comparable period. Currently, service providers are still reluctant to accept business impact penalty pricing.
In order to avoid SLA violations, a provider is interested in being informed about possible SLA violations before they actually occur. To this end, predicting whether an SLA is likely to be breached may typically be performed by extrapolating at a current point of time the current compliance percentage to the entire time interval of the SLA in order to estimate whether the SLA is likely to be complied with. For instance, if the total evaluation period of an SLA is 10 days, and during the 8 days that have elapsed, the service has been unavailable during 1 hour (current compliance percentage 99.58%), then it will probably be unavailable during 1.25 hours during the total evaluation period, yielding an estimated compliance percentage of 99.48%, on the supposition that the availability of the service in the future will be the same as it has been during the elapsed part of the evaluation period. (It should be mentioned that compliance percentages are always indicated with regard to the entire evaluation period of the SLA.) If a target SLO compliance of 99.50% has been agreed upon in the SLA, then the SLA is likely to be breached. However, it could be the case that the two remaining days fall on a weekend, so that there might still be a chance that the SLA may be complied with (for instance, if the SLA refers to an access time which is typically smaller during weekends since fewer people access the network resource). The results of this way of predicting compliance of SLAs are better, the closer the time is to the end of the evaluation period of the SLA.
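The extrapolation in the example above can be reproduced with the following short sketch. The function name and its parameters are chosen for illustration only. Both percentages are expressed with regard to the entire evaluation period, as stated in the text.

```python
def extrapolated_compliance(total_hours, elapsed_hours, noncompliant_hours):
    """Linearly extrapolate non-compliance to the end of the evaluation period."""
    # Current compliance, relative to the entire evaluation period.
    current = 100.0 * (total_hours - noncompliant_hours) / total_hours
    # Supposition: the future part behaves like the elapsed part.
    projected_noncompliant = noncompliant_hours * total_hours / elapsed_hours
    estimated = 100.0 * (total_hours - projected_noncompliant) / total_hours
    return current, estimated


# 10-day period (240 h), 8 days elapsed (192 h), 1 h of unavailability so far.
current, estimated = extrapolated_compliance(240.0, 192.0, 1.0)
likely_breached = estimated < 99.50
```

Rounded to two decimals, the sketch yields the 99.58% and 99.48% of the example, so a target SLO compliance of 99.50% would be flagged as likely to be breached.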
In some of the embodiments, a degree of service-quality compliance in an IT infrastructure is predicted at a current point of time within an evaluation period before the end of the evaluation period. Service-quality compliance means that a service-quality parameter of the IT infrastructure complies with a service-quality objective. A statistic is obtained which indicates probabilities that the service-quality parameter complies with the service-quality objective in sub-periods of the future part of the evaluation period. The statistic is based on known compliance frequencies in equivalent sub-periods in the past. On the basis of the statistic, an estimated duration is calculated in which the service-quality objective will be complied with during the future part of the evaluation period. In some of the embodiments, a service-quality condition is a metric of a resource of the IT infrastructure in comparison to a threshold which is used to determine whether a service-quality parameter (in this case the metric) complies with the service-quality objective. An example of an elementary service-quality condition is “database_access_time<0.3 sec.”. In other embodiments, a service-quality condition involves several metrics which are evaluated to one composite service-quality parameter. To calculate the estimated duration during which the service-quality parameter will comply with a service-quality objective, a statistic is obtained, which indicates probabilities that the service-quality parameter complies with the service-quality objective in sub-periods of the future part of the evaluation period. In some of the embodiments, the statistic is based on known frequencies in equivalent sub-periods in the past. An equivalent sub-period is, for example, the same day of a week or the same hour(s) of a day. The recurring time interval may be a day, a week, a month or a year, according to a cyclical behavior of the service-quality parameter, and is subdivided into smaller sub-periods.
For each sub-period, it is indicated whether the service-quality condition is complied with in this sub-period. On the basis of the statistic, the estimated duration in which the service-quality condition is complied with during the future part of an evaluation period is calculated.
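A minimal sketch of such a statistic and of the duration estimate derived from it is given below. It assumes, for illustration only, that a week is the recurring time interval, that days are the sub-periods, and that history is available as (sub-period, compliant hours, observed hours) records; the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_statistic(history):
    """Map each equivalent sub-period to its past compliance frequency.

    history: iterable of (sub_period_key, compliant_hours, observed_hours).
    """
    compliant = defaultdict(float)
    observed = defaultdict(float)
    for key, ok_hours, total_hours in history:
        compliant[key] += ok_hours
        observed[key] += total_hours
    return {key: compliant[key] / observed[key] for key in observed}


def estimated_compliant_duration(statistic, future_sub_periods):
    """Sum probability times length over the future sub-periods.

    future_sub_periods: iterable of (sub_period_key, length_in_hours).
    """
    return sum(statistic[key] * hours for key, hours in future_sub_periods)


# Hypothetical history: two past weeks, with one non-compliant hour on a Monday.
history = [
    ("Sat", 24.0, 24.0), ("Sat", 24.0, 24.0),
    ("Sun", 24.0, 24.0), ("Sun", 24.0, 24.0),
    ("Mon", 23.0, 24.0), ("Mon", 24.0, 24.0),
]
statistic = build_statistic(history)

# The future part of the evaluation period consists of one Saturday and one Sunday.
future_hours = estimated_compliant_duration(statistic, [("Sat", 24.0), ("Sun", 24.0)])
```

Because the past Saturdays and Sundays were fully compliant in this sample, the estimate for the remaining weekend is the full 48 hours, unlike a linear extrapolation from the weekdays.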
In other embodiments, the calculated estimated duration is used to calculate an estimated SLO compliance for the end of the evaluation period at a current point of time before the end of an evaluation period. A service level objective is defined as a service-quality condition in comparison to a target SLO compliance, which is typically indicated as a percentage value. An example of a service level objective is “(database_access_time<0.3 sec)>99.98%”. This means that the service-quality condition needs to be fulfilled for 99.98% of the service-quality parameter values obtained (which may be obtained, for example, each second or minute during the evaluation period). In the given example, the target SLO compliance is 99.98%. To calculate the estimated SLO compliance, the elapsed part of the evaluation period, which is the time from the beginning of the evaluation period to the current point of time, is also considered. The duration is measured, during which the service-quality condition has been complied with during the elapsed part, and is added to the estimated duration in which the service-quality condition will be complied with during the future part of the evaluation period. This sum is used to calculate an estimated SLO compliance for the end of the time interval.
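The calculation of the estimated SLO compliance may be sketched as follows. The function name and the sample values are illustrative assumptions; the sample reuses a 10-day evaluation period of which 8 days have elapsed with one non-compliant hour.

```python
def estimated_slo_compliance(elapsed_compliant_hours,
                             estimated_future_compliant_hours,
                             evaluation_period_hours):
    """Estimated SLO compliance (percent) for the end of the evaluation period."""
    total_compliant = elapsed_compliant_hours + estimated_future_compliant_hours
    return 100.0 * total_compliant / evaluation_period_hours


# Elapsed part: 192 h, of which 191 h were compliant (measured).
# Future part: 48 h, all estimated to be compliant (from a hypothetical statistic).
estimate = estimated_slo_compliance(191.0, 48.0, 240.0)
meets_target = estimate >= 99.50
```

With these sample values the estimate exceeds a target SLO compliance of 99.50%, whereas a purely linear extrapolation of the elapsed part would fall just below it.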
In other embodiments, the service-quality parameter refers to availability of a service which may be the availability of a network resource, such as an up/down metric of a network device, whereas in other embodiments, availability refers to a service which includes metrics of several network resources. In the context of availability, the service-quality objective, to which the service-quality parameter is compared, is “1” indicating that the service needs to be available to comply with the service-quality objective. The service-quality parameter adopts the value “0” if the service is unavailable and the value “1” if the service is available.
In some of the embodiments, a degree of service availability compliance is predicted in an IT infrastructure, wherein service availability is complied with, if the service is available. At a current point of time within an evaluation period before the end of the evaluation period, a statistic is obtained which indicates probabilities that a service will be available in sub-periods of the future part of the evaluation period. The statistic is based on known availability frequencies in equivalent sub-periods in the past. A calculation on the basis of the statistic indicates an estimated duration in which the service will be available during the future part of the evaluation period.
In some of the embodiments, a recurring time interval is determined by analyzing the cyclic pattern of the service-quality parameter. When analyzing access time of a database server, it may be ascertained, for example, that the access time is approximately the same on every Monday, Tuesday, etc. and that the access time is significantly shorter at weekends. Then, it is adequate to assume that a week is the recurring time interval which is subdivided into sub-periods, such as a day. In other embodiments, it is adequate to assume that a day is a recurring time interval, since each day shows approximately the same behavior of a service-quality parameter. A day can, for example, be further subdivided into shorter sub-periods, such as one- or two-hour intervals. All service-quality parameter values showing the same behavior may therefore be summarized into equivalent sub-periods, i.e. the service-quality parameter values of all Mondays, all Tuesdays, all Wednesdays, etc. are associated to one sub-period, respectively. In some of the embodiments, all workdays are considered as one equivalent sub-period of the statistic, and all holidays are considered as another equivalent sub-period.
In some of the embodiments, cyclic patterns are determined by means of mathematical analysis of existing samples, such as Fast Fourier Transformation or Wavelets.
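As an illustration of such an analysis, the following sketch finds the dominant period of a series of samples. It uses a plain discrete Fourier transform written out in pure Python rather than a Fast Fourier Transformation, which gives the same spectrum but more slowly; the function name and the sample series are assumptions made for the example.

```python
import cmath
import math


def dominant_period(samples):
    """Return the period (in samples) of the strongest frequency component.

    Returns None if no frequency carries any power.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant component
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k if best_k else None


# Hypothetical daily access times over 8 weeks: longer on the five workdays,
# shorter on the two weekend days.
samples = ([0.30] * 5 + [0.10] * 2) * 8
period_in_days = dominant_period(samples)
```

For this sample series, the strongest component corresponds to a period of seven days, which suggests a week as the recurring time interval.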
In some of the embodiments, the recurring time interval is determined on the basis of experience values of an IT infrastructure operator who knows or who may estimate cyclic patterns of service-quality parameters.
In other embodiments, the recurring time interval is arbitrarily chosen based on the evaluation period. For instance, in the case of weekly evaluation periods, a statistic is computed for every hour of the week to cover cyclic patterns based on the hours of a day and the days of a week. For monthly and quarterly evaluation periods, a statistic is computed for every day of the week.
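The mapping of a measurement time to its equivalent sub-period under this scheme might look as follows. The function name and the string labels for the evaluation periods are assumptions made for the example.

```python
from datetime import datetime


def sub_period_key(timestamp, evaluation_period):
    """Return the statistic bucket to which a measurement time belongs."""
    if evaluation_period == "weekly":
        # One bucket per hour of the week: 0 (Monday 00:00) to 167 (Sunday 23:00).
        return timestamp.weekday() * 24 + timestamp.hour
    if evaluation_period in ("monthly", "quarterly"):
        # One bucket per day of the week: 0 (Monday) to 6 (Sunday).
        return timestamp.weekday()
    raise ValueError("unsupported evaluation period: %r" % (evaluation_period,))


# A Friday afternoon measurement.
measured_at = datetime(2005, 10, 7, 14, 30)
hour_of_week = sub_period_key(measured_at, "weekly")
day_of_week = sub_period_key(measured_at, "monthly")
```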
In some of the embodiments, more than one statistic is calculated, for instance, one statistic for the days of a week and one statistic for the days of a quarter. Values from both statistics are then combined to estimate predictive compliance.
In other embodiments, if the estimated SLO compliance is insufficient with regard to the target SLO compliance, an estimated violation interval is calculated, during which a violation of the target SLO compliance occurs for the first time (violation point). This is done by calculating estimated SLO compliances for some points of time in the future part, starting with the points of time closest to the current point of time, and finding the first point of time at which the target SLO compliance is violated (point of time P2). The violation interval is then the interval starting at a point of time at which the target SLO compliance is still complied with (point of time P1) and ending at P2. In some of the embodiments, the points of time P1 and P2 lie sufficiently close together, so that the violation interval may then be considered as a violation point. The determination of a violation point may be performed by narrowing down the time interval between P1 and P2.
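One possible way of locating the violation interval is sketched below. It rests on an assumption not spelled out in the text: the estimated SLO compliance at a future point of time is taken to be the share of the entire evaluation period that is not yet expected to be non-compliant up to that point. The function name and sample values are likewise illustrative.

```python
def violation_interval(period_hours, elapsed_hours, elapsed_noncompliant_hours,
                       future_sub_periods, target_percent):
    """Return (P1, P2) in hours from the period start, or None if no violation.

    future_sub_periods: list of (length_in_hours, compliance_probability),
    ordered from the current point of time onward.
    """
    noncompliant = elapsed_noncompliant_hours
    p1 = elapsed_hours
    for hours, probability in future_sub_periods:
        # Expected non-compliant time accumulated in this sub-period.
        noncompliant += hours * (1.0 - probability)
        p2 = p1 + hours
        compliance = 100.0 * (period_hours - noncompliant) / period_hours
        if compliance < target_percent:
            return p1, p2  # target still met at P1, violated at P2
        p1 = p2
    return None


# 240 h period, 192 h elapsed with 1 h non-compliant, target 99.50%.
# The remaining 48 h are split into four 12 h sub-periods.
future = [(12.0, 0.99), (12.0, 0.99), (12.0, 1.0), (12.0, 1.0)]
interval = violation_interval(240.0, 192.0, 1.0, future, 99.50)
```

Splitting the interval between P1 and P2 into shorter sub-periods and repeating the search narrows the interval down towards a violation point.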
In some of the embodiments, the user is alerted by an audio and/or visual signal, if the calculation shows that the estimated SLO compliance is insufficient for the target SLO compliance. Moreover, the user is informed about the violation interval. This allows the user to take preemptive measures, such as upgrading network resources, in order to avert a violation of the SLA. If the violation cannot be averted, since the violation point is in the very near future, the violation may at least be mitigated which may reduce the contract penalty stipulated in the SLA.
In some of the embodiments, the statistic is continuously updated in response to the receipt of new metric values or calculations of the values of service-quality parameters. For instance, if the statistic says that the probability that (database_access_time<0.3 seconds) is 100% and the metric values delivered indicate that the database_access_time is currently 0.5 seconds, then the statistic is immediately corrected downward.
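One conceivable way of keeping such a statistic up to date is to store, for each equivalent sub-period, the number of observations and the number of compliant observations, and to read the probability off as their ratio. The class `ComplianceStatistic`, the sub-period key `"Mon-09"` and the observed values below are hypothetical and serve only to illustrate the downward correction described above.

```python
from collections import defaultdict


class ComplianceStatistic:
    """Per-sub-period compliance frequencies, interpreted as probabilities."""

    def __init__(self):
        self._compliant = defaultdict(int)
        self._total = defaultdict(int)

    def update(self, sub_period, complied):
        """Record one new observation for the given equivalent sub-period."""
        self._total[sub_period] += 1
        if complied:
            self._compliant[sub_period] += 1

    def probability(self, sub_period):
        """Return the compliance frequency, or None if nothing was observed."""
        total = self._total.get(sub_period, 0)
        if total == 0:
            return None
        return self._compliant[sub_period] / total


statistic = ComplianceStatistic()
# Four past observations in which database_access_time < 0.3 seconds held.
for access_time in (0.10, 0.20, 0.25, 0.15):
    statistic.update("Mon-09", access_time < 0.3)
before = statistic.probability("Mon-09")

# A new metric value of 0.5 seconds violates the condition.
statistic.update("Mon-09", 0.5 < 0.3)
after = statistic.probability("Mon-09")
```

Here the probability for the sub-period falls from 100% to 80% as soon as the non-compliant value is recorded.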
Service-quality parameter values, being based upon metric values, are collected over a long period of time and are represented in the statistic with reference to a shorter, recurring time interval. This is due to the fact that, in many cases, reasonable estimations concerning compliance with a service level objective can only be made if the service-quality parameter values measured are subject to a cyclic pattern. It is often the case that the access time is subject to a weekly cycle, which means that the access time every Tuesday is approximately the same and is probably longer than the access time on Sundays, since fewer people access the database on Sundays than on Tuesdays. Therefore, obtaining the statistic includes determining the cyclic pattern of the service-quality parameter, which is then used as the recurring time interval. If one wishes, for example, to estimate a service-quality condition compliance for a future part of an evaluation period which includes a national holiday during the week (i.e. not at the weekend), then it may be advisable to use a statistic based upon service-quality parameter values derived from weeks with a holiday on the same day. The use of a statistic indicating compliance probabilities for weeks without a holiday would distort the estimation.
In some of the embodiments, the estimated duration is calculated as an expected value, which is defined as the sum, over the sub-periods of the future part of the evaluation period, of the probability that the service-quality condition is complied with in a sub-period multiplied by the length of that sub-period.
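Written as a formula, and using symbols introduced here purely for illustration (they do not appear in the embodiments), the estimated duration for a future part consisting of n sub-periods may be expressed as follows, where p_i denotes the probability that the service-quality condition is complied with in sub-period i and \ell_i denotes the length of that sub-period:

```latex
\[
  \hat{D}_{\text{future}} \;=\; \sum_{i=1}^{n} p_i \,\ell_i
\]
% Hypothetical worked example: three sub-periods of 24 hours each with
% compliance probabilities 1.0, 0.9 and 0.5.
\[
  \hat{D}_{\text{future}}
  \;=\; 1.0 \cdot 24\,\text{h} + 0.9 \cdot 24\,\text{h} + 0.5 \cdot 24\,\text{h}
  \;=\; 57.6\,\text{h}
\]
```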
In some of the embodiments, the estimated duration in which the service-quality condition is complied with is re-calculated periodically in order to increase its accuracy. As time progresses, the future part of the evaluation period gets shorter. Consequently, the calculation of the estimated duration in which a service-quality condition is complied with during the entire evaluation period becomes less probabilistic and increasingly based on factual measurements.
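The shift from probabilistic to measured values may be illustrated by the following sketch. It assumes, as one plausible reading of the above, that the estimate for the entire evaluation period is the measured compliant duration of the elapsed part plus the expected compliant duration of the future part, divided by the total length. The function name and all figures are hypothetical.

```python
def estimated_compliance(measured_compliant, elapsed, future_sub_periods):
    """Estimate the compliance percentage for the entire evaluation period.

    measured_compliant: compliant duration actually measured so far
    elapsed:            length of the elapsed part of the evaluation period
    future_sub_periods: (probability, length) pairs for the future part
    """
    expected_future = sum(p * length for p, length in future_sub_periods)
    future_length = sum(length for _, length in future_sub_periods)
    total = elapsed + future_length
    if total <= 0:
        raise ValueError("evaluation period must have positive length")
    return 100.0 * (measured_compliant + expected_future) / total


# Hypothetical weekly evaluation period of 168 hours, daily sub-periods.
# After two days: 47 of 48 hours compliant, five days still estimated.
early = estimated_compliance(47.0, 48.0, [(0.95, 24.0)] * 5)

# After six days: 140 of 144 hours compliant, one day still estimated.
late = estimated_compliance(140.0, 144.0, [(0.95, 24.0)])
```

In the second call only 24 of the 168 hours still rest on probabilities, so the result depends almost entirely on factual measurements.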
In some of the embodiments, the statistic is obtained from a service level management reporting datamart storing historical data from the individual network resources.
Some of the embodiments of the computer program product with program code for performing the described methods include any machine-readable medium that is capable of storing or encoding the program code. The term “machine-readable medium” shall accordingly be taken to include, for example, solid-state memories and removable and non-removable optical and magnetic storage media. In other embodiments, the computer program product is in the form of a propagated signal comprising a representation of the program code, which is increasingly becoming the usual way to distribute software. The signal is, for example, carried on an electromagnetic wave, e.g. transmitted over a copper cable or through the air, or a light wave transmitted through an optical fiber. The program code may be machine code or another code which can be converted into machine code, such as source code in a multi-purpose programming language, e.g. C, C++, Java, C#, etc. The embodiments of a computer system may be commercially available general-purpose computers programmed with the program code.
Returning now to
In
Alternatively, a statistic, like the one of
It should be mentioned that the statistic of
In
The calculation of the estimation is elucidated in
a shows another example of the service level objective 10 of
b shows an equivalent tree representation of the service-quality condition of
The subject-matter described below in connection with FIGS. 8 to 11 mainly corresponds to that already described in FIGS. 3 to 6, but now using the composite service-quality parameter of
In
a shows a flowchart of the process of estimating, at a current point of time, the availability percentage at the end of the time interval. At 20, a current statistic is obtained which is based on a data pool that also includes very recent data. The statistic is obtained by calculating frequencies of service-quality compliance in the past in equivalent sub-periods and considering these frequencies as probabilities. At 21, an estimated availability percentage is calculated, at the current point of time, for the end of the evaluation period of the service level agreement. At 22, it is ascertained, on the basis of the estimated availability percentage, whether the service level objective is likely to be violated at the end of the evaluation period. If the calculation indicates that it is likely to be violated, then, at 23, the service provider is alerted so that s/he may proactively take countermeasures to avert the SLO violation and thereby avoid the payment of contract penalties. There are several ways to indicate the SLA compliance prediction to the user. The user may be provided with the estimated SLO compliance percentage (which the user may compare with the target SLO compliance himself/herself), with an indication of whether the target SLO compliance will be complied with or not, with an indication of the probability that the SLO will be complied with at the end of the evaluation period, with an indication of the estimated time of violation and/or with the difference between the estimated SLO compliance and the target SLO compliance. All these indications are different representations of the information provided to the user with regard to a prediction of SLA compliance. In addition, at 24, the estimated violation interval is calculated. At 25, this information is also indicated to the service provider, so that s/he knows how much time is left before the SLO is violated. The information may be regarded as an indication of urgency informing the service provider about the time remaining before the probable SLO violation. If, however, the calculation at 21 indicates that the SLO will not be violated, then it is ascertained at 26 whether the end of the evaluation period has been reached. If so, the procedure is finished. If the evaluation period has not yet finished, then, at 27, the current point of time is moved forward (to indicate the progress of time) by Δt. Then, in order to close the loop, a re-calculation is performed at 21 at the new current point of time.
b illustrates the way new incoming metric values are handled. At 28, the metric values are received from the metric adapters at a point of time t. They are inserted into a service-quality condition tree, which is evaluated at 29 to establish whether the service is available at the point of time t. At 30, the metric values are also used to update the statistic, so that the statistic always includes the most recent metric values. At 31, it is ascertained whether the end of the evaluation period is reached. If so, then the procedure stops. Otherwise, at 32, the point of time is moved forward by Δτ, at which again new metric values are received by the metric collector 2 from the metric adapters 9.
Thus, the embodiments of the invention described above allow for a more precise SLA compliance prediction by taking into account cyclic variations, such as workdays in contrast to weekends, of a service-quality parameter.
All publications and existing systems mentioned in this specification are herein incorporated by reference.
Although certain methods and products constructed in accordance with the teachings of the invention have been described herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all embodiments of the teachings of the invention fairly falling within the scope of the appended claims either literally or under the doctrine of equivalents.
Number | Date | Country | Kind |
---|---|---|---|
05300801.7 | Oct 2005 | EP | regional |