The present disclosure relates to Quality of Service (QoS). More specifically, this disclosure relates to a method and system for optimizing QoS based on a model of perceived QoS and of a QoS metric.
A service provider may typically provide operational resource allocations, or inputs (such as provided computer processing, data, information and communication services, transportation, hospitality or other services, catered food, etc.) in order to deliver services to a customer, whether a consumer, a business, or another intermediary organization. In providing such allocations, the provider may face constraints on resources or costs of resources that fluctuate in time. For example, input costs could include electricity costs that fluctuate during the course of a day, gasoline or other fuels, human labor, raw ingredients or materials, research and development costs, or property and equipment depreciation. In response to such fluctuating costs or supply, as well as to fluctuating demands from customers, a provider may wish to optimize the amount and timing of allocations the provider provides. Typically the provider may do so by varying quantity or quality of allocations in response to fluctuating costs, while still satisfying an agreed-upon Service Level Agreement (SLA) with the customer specifying an acceptable range of values for various quality of service (QoS) metrics.
However, simply providing sufficient allocations to satisfy the SLA may leave customers with a poor impression. In particular, an SLA may often stipulate a minimum or range of acceptable values of a QoS metric, but simply satisfying the minimum acceptable level of QoS may do little to convince a customer of good provided value. Optimizing QoS by minimizing costs and merely satisfying the SLA runs a danger of setting too low a QoS target, whereas exceeding the SLA could produce inefficient allocation of resources.
In many cases, customers' impressions of QoS are formed instinctively rather than logically or based on objective metrics such as those in an SLA. Behavioral economics research shows that perception of the quality of an experience is not always rational, and in fact is often based on heuristics. As described in some of Kahneman's groundbreaking work (see, for example, Thinking, Fast and Slow by Daniel Kahneman, published by Farrar, Straus and Giroux, 2011, hereby incorporated by reference in the present application), people can be described as having two modes of mental operation, System 1 and System 2. System 1 is intuitive and very quick to operate and judge a situation, but tends to be heuristic-based, and therefore not strictly rational. System 2 is more rational, and is called into play when a situation demands more in-depth analysis, when System 1 is unsure, or when a person focuses on the details of a problem. In evaluating QoS, an individual's System 1 would give an impression of how the service was performing, whereas System 2 would analyze the service's performance, e.g. with respect to metrics such as those in an SLA. To truly satisfy the customer's expectations, the service must satisfy both System 1 and System 2, as the customer may operate in either mode at any given time.
Behavioral economics has also shown that many irrationalities in decision making are systematic and can be anticipated. For example, the framing effect refers to people's tendency to treat the same situation or choices differently depending on whether something is perceived as a loss or a gain. This finding matches research in customer expectations, which shows that both the raw performance of a product and the difference between performance and expected performance play a role in customer perceptions. Likewise, prospect theory implies that dips below expectations have a stronger negative effect on perceived QoS than peaks above expectations have a positive effect, and there seems to be an absolute minimum performance threshold expected for acceptable service. Customer expectations can also change over time, e.g. by adapting to experienced levels of service. For example, the worst and most recent performance experienced are often the most salient (the so-called peak-end rule).
One embodiment of the present invention provides a system and method for optimizing Quality of Service (QoS) based on a model of perceived QoS and of a QoS metric. During operation, the system obtains a Service Level Agreement (SLA) between a provider and a customer, wherein the SLA specifies a QoS metric and a corresponding range. The system then optimizes the QoS metric and a perceived QoS of the customer. This further comprises modeling the customer's perceived QoS based on an operational allocation of the provider and modeling the QoS metric, based on the operational allocation of the provider. Optimizing the perceived QoS may further comprise determining an optimized level of the provider's operational allocation or input that enhances the customer's modeled perceived QoS and the modeled QoS metric. The system may then set the provider's operational allocation to the optimized level.
In a variation on this embodiment, determining the optimized level of the provider's operational allocation may involve constraining the provider's operational allocation to correspond to a modeled QoS metric within the range specified by the SLA.
In a variation on this embodiment, determining the optimized level of the provider's operational allocation involves optimizing the modeled QoS metric via a stochastic constraint or a contribution to an objective function.
In a variation on this embodiment, modeling the customer's perceived QoS may involve basing the perceived QoS on a worst value of the QoS metric over a recent time interval.
In a variation on this embodiment, modeling the customer's perceived QoS involves basing the perceived QoS on one or more of: an average value of the QoS metric over a recent time interval; and a comparison of a recent value of the QoS metric to the average value.
In a variation on this embodiment, modeling the customer's perceived QoS involves basing the perceived QoS on one or more of: a comparison between a recent value of the QoS metric and the range; and the comparison, weighted such that the recent value of the QoS metric falling below the range is weighted more strongly than the recent value falling above the range.
In a variation on this embodiment, determining the optimized level of the provider's operational allocation further comprises one or more of: determining that providing the optimized level of operational allocation will increase a cost associated with the operational allocation by less than a predetermined threshold or proportion; and determining the cost and the enhanced perceived QoS as a function of the optimized level of the operational allocation.
In a variation on this embodiment, determining the optimized level of the provider's operational allocation involves optimizing perceived QoS via one or more of: a constraint; a stochastic constraint; and a contribution to an objective function.
In a variation on this embodiment, the range for the QoS metric includes one or more of: a minimum value of the provider's operational allocation; a deadline for a task completion; and a number of incomplete tasks.
In the figures, like reference numerals refer to the same figure elements.
The following description is presented to enable any person skilled in the art to make and use the embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Embodiments of the present invention solve the problem of optimizing Quality of Service (QoS) by modeling perceived QoS. The system may model a customer's perception of QoS as a function of a provider's operational allocations or inputs, together with outputs such as an agreed-upon QoS metric. The system can improve on previous systems by optimizing QoS, taking into account this modeled perceived QoS. Specifically, the system may optimize perceived QoS, subject to simultaneously satisfying an agreed-upon Service Level Agreement (SLA), which may specify an acceptable range of QoS metric values. In some embodiments, the system may optimize both perceived QoS and a modeled QoS metric. During operation, the system obtains an SLA between a provider and a customer, specifying a QoS metric and a corresponding range. The system then optimizes a perceived QoS of the customer, based on the range, which further comprises modeling the customer's perceived QoS and the QoS metric, based on an operational allocation of the provider. Optimizing the perceived QoS based on the range may further comprise determining an optimized level of the provider's operational allocations that enhances the customer's modeled perceived QoS, and also corresponds to a modeled QoS metric within the range. The system may then set the provider's operational allocations to the optimized level.
As noted above, merely satisfying the SLA at minimum cost runs a danger of setting too low a QoS target and leaving customers with a poor impression, whereas exceeding the SLA could produce inefficient allocation of resources. The disclosed system can improve on previous systems that optimize QoS, by taking into account a model of perceived QoS. This can help conserve resources by optimizing the level of provided operational allocations in a way that satisfies the SLA, and leaves the customer satisfied with QoS.
On the other hand, an optimized solution taking into account a simple model of perceived QoS is shown in
To model this effect, the system can subtract the minimum operational allocation, as an offset (or penalty term), from the cost function. The resulting optimized solution still lowers operational allocation during the more expensive hours, but not to zero. In optimizing, the system can search through the space of possible values of $x_m$, the lowest allocation, to find a value that best balances cost with perceived QoS. As shown, the system then directs the data center to operate at full capacity during the cheapest hours, and at $x_m$ during more expensive hours, until fulfilling the SLA minimum for operation hours. Such a solution optimizes the model of perceived QoS, together with costs, and subject to the constraint of satisfying the SLA.
As this example illustrates, the desired perceived QoS is determined based on a tradeoff between the system's predicted ability to achieve a particular level of service, costs associated with that service, and benefits in terms of customer satisfaction.
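The search over the lowest allocation described in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: capacity is normalized so that each hourly allocation lies between 0 and 1, the hourly cost is taken to be linear in the allocation, the SLA is reduced to a required total of operation hours, and the names `schedule_for_floor` and `best_floor`, the price list, and the values of `a`, `b`, and `k` are all hypothetical.

```python
def schedule_for_floor(prices, x_m, required, a=0.0, b=1.0):
    """Cheapest schedule in which every hour runs at least at x_m
    and the total allocation reaches `required` operation hours."""
    n = len(prices)
    x = [x_m] * n
    remaining = max(0.0, required - x_m * n)
    # Raise the cheapest hours to full capacity (1.0) first.
    for i in sorted(range(n), key=lambda i: prices[i]):
        if remaining <= 1e-12:
            break
        extra = min(1.0 - x_m, remaining)
        x[i] += extra
        remaining -= extra
    # Assumed linear hourly cost: price * (a + b * allocation).
    cost = sum(c * (a + b * xi) for c, xi in zip(prices, x))
    return x, cost


def best_floor(prices, required, k, steps=20, a=0.0, b=1.0):
    """Scan a discretized grid of floors x_m and keep the one that
    minimizes cost minus the perceived-QoS offset k * x_m."""
    best = None
    for s in range(steps + 1):
        x_m = s / steps
        x, cost = schedule_for_floor(prices, x_m, required, a, b)
        total = cost - k * x_m
        if best is None or total < best[0] - 1e-12:
            best = (total, x_m, x)
    return best


if __name__ == "__main__":
    hourly_prices = [1.0, 1.0, 4.0, 4.0]  # hypothetical prices
    print(best_floor(hourly_prices, required=2.0, k=0.0))
    print(best_floor(hourly_prices, required=2.0, k=7.0))
```

With `k = 0` the sketch reduces to pure cost minimization and switches off entirely during expensive hours; a larger `k` raises the floor, which mirrors the tradeoff between cost and perceived QoS discussed above.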
During operation, the system obtains QoS requirements or target perceived QoS 202. The system may then model perceived QoS (operation 204). Based on modeling 204, and taking into account desired metric QoS, prices, and loads 206, the system may generate a target QoS 208. The system may then optimize provider resource allocations (operation 210) to obtain optimized provider allocation level 212. For example, the provider may provide data center operation, transportation such as buses or trucks, or hotel operation, and the system could optimize the level of electrical or other power purchased, scheduling of transportation services, workload distribution, etc. Based on optimized provider allocation level 212, the system may then set the provider's operational allocations to the optimized level during the provider's operations 214. Thus, the disclosed system can improve operations 214 compared to previous systems by optimizing the level of provided operational allocations subject to the SLA, conserving resources, and leaving the customer more satisfied with QoS than previous systems. Based on operations 214, the system may achieve the guideline for metric QoS 216 stipulated by the SLA and/or desired metric QoS 206, thereby feeding back into overall process flow 200 for future optimization.
QoS optimization system 300 may include a perceived QoS module 302 installed on a storage device 304 coupled to a server 306. Note that various implementations of the present invention may include any number of computers, servers, and storage devices. In various implementations, perceived QoS module 302 may include a QoS optimizing module or other components of QoS optimization system 300 to perform the techniques described herein. System 300 may receive data describing a QoS metric and/or models, and store such data in storage device 304. System 300 may read the code for perceived QoS module 302 and the data for the QoS metric and/or models 308 from storage device 304. System 300 may divide a metric and/or models, and assign them to processors, such as processors 310A-310H, which operate on the assigned metric and/or models.
The system may then optimize a perceived QoS of the customer, based on the range for the QoS metric. This may further comprise modeling the customer's perceived QoS based on an operational input by the provider (operation 404) and modeling the QoS metric based on the operational allocations by the provider (operation 406). Specifically, in some embodiments, the system may model perceived QoS and the QoS metric as functions of the provider's operational allocations. In various embodiments, these modeling functions may be numerical or analytic, and may be designed by experts, based on known research findings, or may be learned automatically by the system, using various techniques such as machine learning that are known in the art. In some embodiments, the system may optimize multiple metrics and/or provider operational allocations, and the modeling functions may be simultaneous functions of multiple provider allocations and/or other variables or parameters.
Note that although the perceived QoS is generally more subjective than the QoS metric, the perceived QoS may nevertheless also be measurable and well-defined. For example, perceived QoS may be quantified by a survey using scientific scales such as a Likert or rating scale, or measured by other methods such as focus groups. Then the goal of the perceived QoS model in operation 404 may be to predict the outcomes of such perceived QoS measurements.
In some embodiments, the system may model perceived QoS in operation 404 based on various statistics, relationships, or heuristics involving the QoS metric and/or other variables. For example, the system may model perceived QoS based on an average of previous performance (i.e., the QoS metric), possibly including a model of forgetting (e.g., an average over a recent time interval, or weighted by a decaying factor over time). The system may model perceived QoS based on differences between delivered metric QoS and the SLA-specified range, possibly with dips below the target weighted more strongly than peaks above. The system may model perceived QoS based on the most recent performance as compared to the average. In some embodiments, the system may model perceived QoS based on the worst performance ever experienced (possibly with a decay factor or forgetting model as well). In some embodiments, the system may model perceived QoS based on an absolute minimum performance threshold.
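Several of the heuristics listed above can be written as small functions of a QoS metric history. The following is a hedged sketch only: it assumes that larger metric values are better, that `history` is ordered oldest to newest, and that the function names, the decay factor, and the loss weight are illustrative choices rather than values from this disclosure.

```python
def decayed_average(history, decay=0.8):
    """Average of past performance with a simple forgetting model:
    a sample that is `age` steps old gets weight decay**age."""
    n = len(history)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    return sum(w * q for w, q in zip(weights, history)) / sum(weights)


def worst_recent(history, window):
    """Worst metric value over the most recent `window` samples."""
    return min(history[-window:])


def loss_averse_gap(value, target, loss_weight=2.0):
    """Difference from the SLA target, with dips below the target
    weighted more strongly than peaks above it."""
    gap = value - target
    return gap if gap >= 0 else loss_weight * gap


def recency_contrast(history, decay=0.8):
    """Most recent performance compared to the decayed average."""
    return history[-1] - decayed_average(history, decay)


if __name__ == "__main__":
    samples = [0.9, 1.1, 0.6, 1.0]  # hypothetical metric history
    print(decayed_average(samples))
    print(worst_recent(samples, window=3))
    print(loss_averse_gap(samples[-1], target=0.8))
    print(recency_contrast(samples))
```

How these features are combined into a single perceived QoS score is left open here; a learned or expert-designed weighting, as mentioned earlier, would be one option.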
Optimizing the perceived QoS based on the range may further comprise determining an optimized level of the provider's operational allocations that enhances the customer's modeled perceived QoS, and also corresponds to a modeled QoS metric within the range (operation 408). In some embodiments, the system may carry out optimization 408 within the models resulting from operations 404 and 406. In some embodiments, because satisfying the QoS or ‘hard’ metric is mandatory within the SLA, the system may give the QoS metric higher priority within optimization 408 than the perceived QoS. That is, in some embodiments the system may optimize perceived QoS only within a restricted space of operational allocations, e.g. those that satisfy the SLA, or subject to the constraint of satisfying the SLA. In some embodiments, the system may do so by treating perceived QoS as an offset (or penalty term) to a cost function or to provider operational allocations, as in the example of
In some embodiments, the system may make use of various other strategies to incorporate perceived QoS into the QoS optimization. In some embodiments, the system may optimize metric QoS, rather than restrict operational allocations to those that satisfy the SLA. The system may optimize both perceived QoS and a modeled QoS metric based on a stochastic constraint, offset, or penalty function model. For example, the system may combine perceived QoS with the QoS metric from the SLA in an offset or penalty function model, by using a weighting factor that depends on proximity to the QoS metric. The system may then optimize using this combined measure. The system may also shift from optimizing for perceived QoS to optimizing for the QoS metric depending on proximity to the QoS metric. In some embodiments, the system may optimize for perceived QoS only when doing so increases costs by less than a predetermined threshold percentage or amount. In some embodiments, the system optimizes using a tradeoff curve for cost relative to QoS. In some embodiments, the system artificially raises the QoS requirements to provide a ‘cushion’ or margin of safety for perceived QoS.
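Two of these strategies, the proximity-dependent weighting and the cost-increase threshold, can be illustrated with a short sketch. It assumes that the metric and the perceived QoS score are normalized to comparable scales, that the SLA specifies a minimum acceptable metric value, and that the linear weighting ramp and the names `proximity_weight`, `combined_objective`, and `accept_if_affordable` are hypothetical.

```python
def proximity_weight(metric, sla_min, margin):
    """Weight placed on the SLA metric: 1.0 at or below the SLA minimum,
    falling linearly to 0.0 once the metric is `margin` above it."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    slack = (metric - sla_min) / margin
    return min(1.0, max(0.0, 1.0 - slack))


def combined_objective(metric, perceived, sla_min, margin):
    """Blend the metric and perceived QoS, shifting toward perceived QoS
    as the metric moves safely away from the SLA minimum."""
    w = proximity_weight(metric, sla_min, margin)
    return w * metric + (1.0 - w) * perceived


def accept_if_affordable(base_cost, enhanced_cost, max_increase_pct):
    """Optimize for perceived QoS only if the cost increase stays
    within a predetermined percentage of the baseline cost."""
    return enhanced_cost <= base_cost * (1.0 + max_increase_pct / 100.0)


if __name__ == "__main__":
    print(proximity_weight(12.5, sla_min=10.0, margin=5.0))
    print(combined_objective(15.0, 0.7, sla_min=10.0, margin=5.0))
    print(accept_if_affordable(100.0, 104.0, max_increase_pct=5.0))
```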
In some embodiments, optimization 408 may take into account time or treat the QoS metric, perceived QoS, and/or provider operational allocations as time series, or may optimize the provider allocations over time, as illustrated by
A key to successfully optimizing perceived QoS is prediction. For example, to eliminate dips below a desired performance minimum, the system must predict its available resources and system loads well into the future, so it can ensure resources are available even at times of high demand. In addition, the desired minimum QoS will be based on a tradeoff among many factors, including the benefit in terms of enhanced customer satisfaction, the cost of resources required to achieve the desired QoS, the risks inherent in the uncertain future situation (e.g., expending resources but failing to achieve the desired QoS), and so on. Thus, advance planning is essential to mitigate these challenges.
In order to solve the optimization problem approximately, in some embodiments, the system may generate an optimized solution for the operational allocation time series, minimizing total cost and satisfying the SLA, for each value of $x_m$ in some range. The system may discretize the $x_m$ range in order to facilitate this computation. The system may then determine a global optimum by comparing the overall desirability of each such optimized solution (e.g., in this case by comparing the total cost including the offset) as a function of $x_m$. Such a comparison is illustrated as the tradeoff curve in
To study this example in greater depth, assume that the electricity cost to the data center provider is given by $p = \sum_i c_i (a + b x_i)$. Here $c_i$ is the electricity price at hour $i$ as shown in
To incorporate the model of perceived QoS, the system may subtract the minimum allocation $x_m = \min_i x_i$ as an offset from the cost function, as described above: $p' = p - k x_m$. This modified optimization can be solved by defining $y_i = x_i - x_m$, so that for each given value of $x_m$, the problem becomes $\min_y$
To consider a more complex example in the domain of data center operation, assume that QoS may be assessed taking into account a job's deadline, whether for an interactive or a batch job. That is, job $i$ is expected to be completed within at most time $W^1_i$ from being initiated, or else a financial penalty $c^p_1$ will be charged and a second chance will be granted, with deadline $W^2_i$. If the job is not completed within time $W^2_i$, a second penalty $c^p_2$ is applied and the job is dropped. The SLA allows at most a number $D_j$ of such failures for a client $j$. Even assuming the number of failures remains fewer than $D_j$, in order to satisfy the customer's subjective perceptions, it may be preferable to keep the number of failures as low as practically possible. Thus, in some embodiments, the system may seek to optimize perceived QoS, taking into account failures and/or timely job completion.
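The two-stage deadline rule in this example can be made concrete with a small accounting sketch. Several simplifications are assumptions rather than part of the description: all jobs share the same two deadlines, both deadlines are measured from job initiation, a job that never finishes is represented by `float("inf")`, and a "failure" counted against the SLA limit is taken to be a dropped job. The function name `settle_jobs` and the numeric values are hypothetical.

```python
def settle_jobs(completion_times, w1, w2, c1, c2, max_failures):
    """Apply the two-stage deadline rule to a list of completion times.

    A job finishing after w1 incurs penalty c1; a job still unfinished
    at w2 incurs an additional penalty c2 and is dropped.
    """
    if w2 < w1:
        raise ValueError("second deadline must not precede the first")
    penalty = 0.0
    late = 0
    dropped = 0
    for t in completion_times:
        if t <= w1:
            continue  # completed on time, no charge
        penalty += c1
        late += 1
        if t > w2:
            penalty += c2
            dropped += 1
    return {
        "penalty": penalty,
        "late": late,
        "dropped": dropped,
        "within_sla": dropped <= max_failures,
    }


if __name__ == "__main__":
    times = [1.0, 3.0, 6.0, float("inf")]  # hypothetical completion times
    print(settle_jobs(times, w1=2.0, w2=5.0, c1=10.0, c2=50.0, max_failures=2))
```

Even when `within_sla` is true, the `dropped` count is the quantity a perceived QoS model would try to keep as low as practical.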
In an exemplary embodiment, the system may approach this optimization based on an offset (or penalty term) to a total cost function:
Here
is a count of dropped jobs. The counts
can be efficiently evaluated using techniques known in the queuing theory literature. The last term, proportional to $c^p_{\mathrm{internal}}$, is the offset (or penalty term) representing a model of perceived QoS. Note that this perceived QoS term is in addition to the explicit financial charges $c^p_1$ and $c^p_2$ for late or dropped jobs, and that it increases very fast as the number of dropped jobs approaches the SLA limit $D_j$.
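To illustrate the qualitative behavior of such a term, the sketch below uses one possible functional form that grows rapidly as the number of dropped jobs approaches the SLA limit. The specific expression is an assumption chosen for illustration and is not the expression used in the cost function of this disclosure; the name `perceived_qos_penalty` and its parameters are likewise hypothetical.

```python
import math


def perceived_qos_penalty(dropped, limit, c_internal):
    """Illustrative internal penalty for dropped jobs.

    Assumed form: c_internal * dropped / (limit + 1 - dropped), which is
    zero with no drops, grows faster as drops approach the SLA limit,
    and is treated as infinite once the limit is exceeded.
    """
    if dropped < 0 or limit < 0:
        raise ValueError("counts must be non-negative")
    if dropped > limit:
        return math.inf  # SLA violated: no finite internal cost applies
    return c_internal * dropped / (limit + 1 - dropped)


if __name__ == "__main__":
    for n in range(0, 7):
        print(n, perceived_qos_penalty(n, limit=5, c_internal=2.0))
```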
In some embodiments, the system may react to failure in a more real-time manner. Each time a failure occurs, the QoS may impose more severe constraints on the optimization algorithm. As time passes, a forgetting factor may reduce the impact of these failures on the QoS. One way to model this is that if $t_f$ is the time of failure, the weight on the perceived QoS term of the cost function can be modified by a time-decaying factor, e.g., $c^p_{\mathrm{internal}} \to c^p_{\mathrm{internal}} \, e^{-(t - t_f)}$.
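The forgetting factor can be sketched directly from the exponential form above. The `rate` parameter, which generalizes the unit decay rate, and the summation over several past failures in `failure_pressure` are assumptions added for illustration; the function names are hypothetical.

```python
import math


def decayed_weight(c_internal, t, failure_time, rate=1.0):
    """Weight on the perceived-QoS term at time t after a failure at
    failure_time, decaying as exp(-rate * (t - failure_time))."""
    if t < failure_time:
        raise ValueError("t must not precede the failure time")
    return c_internal * math.exp(-rate * (t - failure_time))


def failure_pressure(c_internal, t, failure_times, rate=1.0):
    """Assumed aggregate over several failures: sum the decayed weights
    of all failures that occurred at or before time t."""
    return sum(
        decayed_weight(c_internal, t, tf, rate)
        for tf in failure_times
        if tf <= t
    )


if __name__ == "__main__":
    print(decayed_weight(2.0, t=4.0, failure_time=3.0))
    print(failure_pressure(2.0, t=4.0, failure_times=[1.0, 3.0, 5.0]))
```

A recent failure therefore weighs heavily on the optimization, while an old one contributes almost nothing.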
Because the count terms are non-convex, in some embodiments, the system may use a simpler or more amenable expression. In some embodiments, the model may be replaced by: $\min \sum_{i=1}^{N} \left( C_i(t) + q_1(\mu)\, c^p_1 + q_2(\mu)\, c^p_2 \right)$. Here $\mu$ is the rate at which service requests are removed from the queue; $q_1(\mu)$ is the probability of a job failing in the first step ($W^1_i \le$
In some embodiments, SLA obtaining module 602 can obtain an SLA between a provider and a customer, specifying a QoS metric and a corresponding range. Perceived QoS modeling module 604 may model the customer's perceived QoS, based on an operational allocation or input by the provider. QoS metric modeling module 606 may model the QoS metric, based on the allocation by the provider. QoS optimizing module 608 may determine an optimized level of the provider's operational allocations that enhances the customer's modeled perceived QoS, wherein the optimized level also corresponds to a modeled QoS metric within the range. Provider's allocation setting module 610 may set the provider's operational allocations to the optimized level. Note that perceived QoS module 302 illustrated in
In some embodiments, SLA obtaining module 602 can obtain an SLA between a provider and a customer, specifying a QoS metric and a corresponding range. Perceived QoS modeling module 604 may model the customer's perceived QoS, based on an allocation by the provider. QoS metric modeling module 606 may model the QoS metric, based on the operational allocations by the provider. QoS optimizing module 608 may determine an optimized level of the provider's operational allocations that enhances the customer's modeled perceived QoS, wherein the optimized level also corresponds to a modeled QoS metric within the range. Provider's allocation setting module 610 may set the provider's operational allocations to the optimized level. Note that perceived QoS module 302 illustrated in
The data structures and code described in this detailed description are typically stored on a computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. The computer-readable storage medium includes, but is not limited to, volatile memory, non-volatile memory, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing computer-readable media now known or later developed.
The methods and processes described in the detailed description section can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium.
Furthermore, methods and processes described herein can be included in hardware modules or apparatus. These modules or apparatus may include, but are not limited to, an application-specific integrated circuit (ASIC) chip, a field-programmable gate array (FPGA), a dedicated or shared processor that executes a particular software module or a piece of code at a particular time, and/or other programmable-logic devices now known or later developed. When the hardware modules or apparatus are activated, they perform the methods and processes included within them.
The foregoing descriptions of various embodiments have been presented only for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present invention.