1. Field of the Invention
The invention relates to a method and arrangement for predicting measurement data using given measurement data.
2. Description of the Related Art
A technical system often requires facilities for forecasting based on known (measurement) data, particularly in the context of error susceptibility or cost estimates.
Forecasts generated by experts are generally subject to errors. Experts cannot carry out exact analyses, at least of highly complex systems.
A stochastic point process, in particular a Poisson process, is described in Sidney I. Resnick: “Adventures in Stochastic Processes”, Birkhäuser Boston, 1992, ISBN 3-7643-3591-2, pp. 303-317 (Resnick).
The object of the invention is to allow the automatic prediction (forecast) of measurement data using given measurement data.
This object is achieved in accordance with the method and apparatus described below; developments of the invention are also described in the following text.
In order to achieve this object, a method is provided for predicting measurement data using given measurement data, in which a stochastic process is matched to the given measurement data. Simulation runs are carried out from a given time-point until a final time-point. The forecast measurement data is determined for each simulation run. Measurement data for the final time-point is predicted within a range of values, which is governed by the forecast measurement data.
One development is to define a confidence range for the prediction of measurement data, where the a % lowest and b % highest forecast measurement data are eliminated. In particular, a % can equal b %. For example, a 95% confidence range can thus be defined by ignoring the 2.5% lowest and 2.5% highest forecast measurement data.
One advantage is that the measurement data can be predicted (forecast) with an accuracy that is within a confidence range, from a given time-point. This makes it possible to identify, e.g., the feasibility or impossibility of a task associated with the measurement data, at an early stage. Appropriate measures can therefore be initiated in order to counteract forecast impossibility.
This is particularly important in the case of a complex system, e.g., a software development process, where the extent to which a schedule can be followed before the software is completed can be shown in a subsequent test phase. Even more important in this context is the ability to adopt countermeasures at an early stage if a delay has been clearly identified, e.g., in an integration test phase. This firstly affects the feasibility of the specified deadline (timescale) and secondly directly affects costs, since non-compliance with the agreed timescale often results in additional costs.
One refinement is for the stochastic process to be a non-homogeneous Poisson process.
In particular, the measurement data may in one refinement comprise numbers of errors. This applies to software development, for example, where the level of maturity is documented in accordance with the errors measured in a test phase. Completion is directly dependent on this level of maturity. In other words, the software cannot be delivered to customers until most of the errors have been removed from the software. This is particularly important with regard to resources (required to test and correct errors) and costs (due to delayed delivery).
In order to achieve the object of the invention, a method is also provided for predicting measurement data using given measurement data, in which a stochastic process is matched to the given measurement data. A range is ascertained, by sorting the probability values generated by the stochastic process according to size, around an expected value. Measurement data is predicted on the basis of this range, and in particular the probability values within the range.
One development is for the probability values generated by the stochastic process to be sorted symmetrically by size around the expected value. In particular, this means that the highest probability value represents the middle of the range, i.e., the expected value, whereas the next highest probability value is arranged to the right or left of the expected value. The next highest probability value is then arranged symmetrically on the other side of the expected value, in turn.
This analytical (design) procedure provides a range, where the breadth of the range in turn indicates which probability values are significant in the prediction of the measurement data.
In one particular refinement, the breadth of the range is determined by ignoring the probability values that lie below a given threshold.
This produces a range (confidence range), which has a specific breadth as a result of the threshold. This breadth corresponds to the certainty with which the measurement data is predicted.
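The threshold-based determination of the range can be sketched in C as follows. This is a minimal illustration, not the implementation of the invention: the function name threshold_range, the array p of probability values indexed by the number of errors, and the assumption that the values are unimodal are all introduced here for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Determine the range [*lo, *hi] of indices whose probability p[l] is at
 * least `threshold`, starting at the most probable index and walking
 * outward on both sides. Assumes (for this sketch) that p[] is unimodal.
 * Returns the probability mass that lies inside the range. */
double threshold_range(const double *p, size_t n, double threshold,
                       size_t *lo, size_t *hi)
{
    assert(p != NULL && n > 0 && lo != NULL && hi != NULL);

    /* Locate the highest probability value (the middle of the range). */
    size_t mode = 0;
    for (size_t l = 1; l < n; l++)
        if (p[l] > p[mode])
            mode = l;

    /* Extend the range while the neighbouring value is still significant. */
    size_t left = mode, right = mode;
    while (left > 0 && p[left - 1] >= threshold)
        left--;
    while (right + 1 < n && p[right + 1] >= threshold)
        right++;

    /* Sum the probabilities that remain inside the range. */
    double covered = 0.0;
    for (size_t l = left; l <= right; l++)
        covered += p[l];

    *lo = left;
    *hi = right;
    return covered;
}
```

A lower threshold widens the range and raises the returned mass; a higher threshold narrows the range, which corresponds to the relationship between breadth and certainty described above.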
If one assumes that the stochastic process is a non-homogeneous Poisson process, then the non-homogeneous Poisson process defines a step size, particularly on a time axis t, which indicates when the next error will occur. One characteristic of the non-homogeneous Poisson process is that it has no memory, so that a “no-memory” search is carried out from each error that occurs at a specific time-point, for a time-point that indicates the next error.
In order to achieve the object of the invention, an arrangement is also provided for predicting measurement data using given measurement data that has a processor unit and is configured in such a way that:
In order to achieve the object of the invention, an arrangement is further provided for predicting measurement data using given measurement data that has a processor unit and is configured in such a way that:
The arrangements are particularly suitable for carrying out the inventive method or the developments described above.
Exemplary embodiments of the invention are shown and explained below with reference to the drawings, in which:
In order to be able to forecast a number of expected errors in a technical process, e.g., in a software development process, non-homogeneous Poisson processes (NHPP) are calibrated (i.e., matched to measurement data, such as the occurrence of errors over time) as follows:
The following equation describes a counting process associated with the stochastic point process (non-homogeneous Poisson process):
{N(t)}_{t ∈ R}
In the present case, where equation (1) represents a non-homogeneous Poisson process, the following equation (cf. Resnick)
Since the nature of the Poisson process dictates that the increases (error increases in this case) are independent of previous increases, equation (5) can be used for the time-points t > t_0 to define a (minimum) range
[g_u, g_o] = [g_u(t), g_o(t)] ⊂ N_0 (8)
Due to the unimodal nature of the Poisson count density, a range [g_u, g_o] can be determined as follows:
Step 1: Sort the elementary probabilities
p_l := P(N(t) − N(t_0) = l), l ∈ N_0
Step 2:
Step 3: Determine an index set
Step 4: Substitute
The range from equation (8) is also referred to as the forecast range.
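One plausible reading of the steps above is sketched below in C. The details of Steps 2 to 4 are not reproduced in this text, so the sketch makes an explicit assumption: the sorted elementary probabilities are accumulated, largest first, until their sum reaches a required level alpha, and the range is spanned by the smallest and largest index collected. The names forecast_range, prob_entry and alpha are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* One elementary probability p_l together with its count l. */
struct prob_entry {
    size_t l;
    double p;
};

/* qsort comparator: larger probabilities first. */
static int by_prob_desc(const void *a, const void *b)
{
    const struct prob_entry *x = a;
    const struct prob_entry *y = b;
    if (x->p < y->p) return 1;
    if (x->p > y->p) return -1;
    return 0;
}

/* Assumed procedure: collect the largest probabilities until their sum
 * reaches alpha; [*g_u, *g_o] spans the collected indices.
 * Returns 0 on success, -1 if alpha is not reached or allocation fails. */
int forecast_range(const double *p, size_t n, double alpha,
                   size_t *g_u, size_t *g_o)
{
    assert(p != NULL && n > 0 && alpha > 0.0 && alpha <= 1.0);
    assert(g_u != NULL && g_o != NULL);

    struct prob_entry *e = malloc(n * sizeof *e);
    if (e == NULL)
        return -1;
    for (size_t l = 0; l < n; l++) {
        e[l].l = l;
        e[l].p = p[l];
    }
    qsort(e, n, sizeof *e, by_prob_desc);   /* Step 1: sort by size */

    double sum = 0.0;
    size_t lo = e[0].l, hi = e[0].l;
    int status = -1;
    for (size_t i = 0; i < n; i++) {        /* build the index set */
        sum += e[i].p;
        if (e[i].l < lo) lo = e[i].l;
        if (e[i].l > hi) hi = e[i].l;
        if (sum >= alpha) {
            status = 0;
            break;
        }
    }
    free(e);

    *g_u = lo;
    *g_o = hi;
    return status;
}
```

Because the Poisson count density is unimodal, the indices collected this way are contiguous, so the minimum and maximum index describe the range completely.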
Stochastic Simulation (second approach)
It is possible to determine the confidence range described using simulation, with the following steps:
Step 1: Start m ∈ N independent simulation runs based on the selected process model at the time-point t_0 of the last error message;
Step 2: End a simulation run as soon as the required final time-point t_e is reached;
Step 3: Repeat Step 2 until all simulation runs are finished;
Step 4: Sort the numbers N̂_i(t_e) of the errors generated in the i-th simulation run in the time period (t_0, t_e), i = 1, ..., m, in descending order, and label the sorted values N̂_(1)(t_e), ..., N̂_(m)(t_e); and
Step 5: Substitute
i.e., eliminate the (100·(1−α)/2) % lowest and highest values.
This produces the confidence range directly.
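The sorting and trimming of the simulated error counts can be sketched in C as follows. This is an illustration under assumptions: the function name simulated_confidence_range is hypothetical, the counts of the m runs are taken to be already available in an array, and the number of values dropped at each end is rounded down.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator: descending order of error counts. */
static int cmp_long_desc(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x < y) - (x > y);
}

/* Sorts counts[0..m-1] in place in descending order, drops the
 * (100*(1-alpha)/2) % highest and lowest values, and returns the
 * remaining extremes as the confidence range [*lo, *hi]. */
void simulated_confidence_range(long *counts, size_t m, double alpha,
                                long *lo, long *hi)
{
    assert(counts != NULL && m > 0 && lo != NULL && hi != NULL);
    assert(alpha > 0.0 && alpha < 1.0);

    qsort(counts, m, sizeof *counts, cmp_long_desc);

    /* Number of values to drop at each end, rounded down. The small
     * offset guards against results such as 0.9999999999999998. */
    size_t k = (size_t)((double)m * (1.0 - alpha) / 2.0 + 1e-9);
    assert(2 * k < m);

    *hi = counts[k];           /* largest value that is kept  */
    *lo = counts[m - 1 - k];   /* smallest value that is kept */
}
```

With m = 1000 runs and alpha = 0.95, for example, the 25 highest and the 25 lowest counts are dropped.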
Each individual simulation run is based on a simulation algorithm that is known (cf. Bratley et al., 1987):
The simulated generation of interarrival times for a non-homogeneous Poisson process is as follows:
Step 1: Substitute
Step 2: Generate a (pseudo) random variable X that is exponentially distributed with the parameter λ̄, i.e., X := −log(U)/λ̄, where U is uniformly distributed over (0,1);
Step 3: Generate a random variable U that is uniformly distributed over (0,1); and
Step 4: If U ≤ λ(t_s + X)/λ̄, then substitute t = t_s + X; otherwise substitute t_s = t_s + X and go to Step 1.
The example graph in
The intensity λ̄ is normally derived from equation (10) for λ. For example, the result is as follows:
a) λ(t) = a·b·c·exp(−b·t^c)·t^(c−1)
(λ(t) is strictly monotonically decreasing for c ≤ 1, and unimodal for c > 1 with a definite maximum at a point
b) Otherwise, λ̄ is derived in accordance with the above comments as follows:
The graph in
The C programming language is used in the following examples, which show an algorithm to define confidence ranges for forecasts and an algorithm for simulated definition of confidence ranges for forecasts.
Program 1:
Program 2:
Program 3:
The above-described method and apparatus are illustrative of the principles of the present invention. Numerous modifications and adaptations will be readily apparent to those skilled in this art without departing from the spirit and scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
198 58 093 | Dec 1998 | DE | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCTDE99/03955 | 12/10/1999 | WO | 00 | 8/7/2001 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO0036426 | 6/22/2000 | WO | A |