This invention relates to time-series forecasting in which a forecast system is provided with time-series data by using a single generic data entity to represent a plurality of different forecasting parameters.
As demand for more accurate forecasts rises, the size of data sets over which a forecast is to be generated can increase, and a data set over which a forecast is sought may contain data with different contexts and/or at different levels of granularity. Modern forecasting algorithms not only have to cope with time-series data which may represent averages obtained over different periods of time, but must also cope with different levels of granularity in dimensions other than time, for example, different spatial contexts (e.g. the area over which the data was collated) and/or other different contexts. Time-series data parameters may also be related by hierarchies or more complex rule-based relationships. Where the time-series data over which a forecast is to be obtained differs in terms of the context (nature and/or relationship to other data and/or level of granularity and/or even data format, etc.) and type of parameters, the data which does not conform with the forecast requirements may be ignored or subjected to pre-processing to map it into a form suitable for generating a forecast. If a forecasting algorithm incorporates means to pre-process data, however, unanticipated variations in the context and/or parameter types of the time-series data over which a forecast is to be obtained require the forecasting algorithm itself to be re-configured.
Adapting forecasting algorithms to recognize data having different contexts and to be also capable of utilizing such data (rather than simply ignoring it when determining a forecast) requires more complex programming which increases the cost of obtaining forecasts. The need to pre-process data also increases the time to generate a forecast, and only data which the forecast system developer anticipated being of a type capable of being pre-processed can be used to generate a forecast.
Technical fields relying on time-series forecasting include the automobile, aeronautical, medical and engineering fields. An example of a technical forecasting application is an application to forecast component failure (e.g. metal fatigue). Some technical applications use forecasts to anticipate a negative result which is then automatically compensated for using a feedback mechanism. In this way, forecast results can be automatically mitigated or obviated when undesirable. Physical systems may use forecast results in time-critical applications where the forecast must be determined rapidly to enable steps to be taken to prevent the unwanted result from occurring.
As a system grows, it may be necessary to amalgamate different data sets over which the forecast is to be obtained (or to incorporate different features which are subsequently found to impact the forecast). If an existing forecasting system cannot utilize the additional data, then no forecasts can be obtained until the forecasting algorithm is corrected or replaced to allow the additional data to be utilized. Customizing a time-series forecasting system so that it is able to provide forecasts for specific requirements can be a complex, costly task involving considerable reconfiguration of the forecasting algorithms. A forecast system designer may need to reconfigure conventional forecasting systems each time parameters are added to or deleted from the time-series data used by the forecast model of the forecast system. As an example, consider the case where a forecast is required in order to ensure appropriate resources are available. In order to produce accurate forecasts, numerous parameters need to be considered, such as geographic area and type of resource. If reconfiguration of the forecasting algorithm is required each time the forecasting model is changed, cost and delay in forecast generation are incurred.
In many scenarios, the plethora of potential forecasting parameters may cause problems, as it may not be clear, when the forecasting tool is being developed, which parameters are required to ensure the forecast is satisfactorily accurate. Developing a time-series forecasting tool is a complicated task with many parts of critical importance and is usually undertaken by skilled application developers. Even so, dealing with new parameters on every customer instance is a costly and time-consuming process.
In United States Patent Application No. US2002/0133385A1, entitled “Method and computer program product for weather adapter consumer event planning”, by F. Fox, D. Pearson et al., a system for forecasting future retail performance is described which has a basic architecture consisting of an analyzer and a configurator, the configurator selecting the specific parameters to be forecast over. However, if the parameters used in the model change, then the configurator will have to be modified accordingly, in addition to the required database changes. Similarly, in United States Patent Application No. US 2002/0169657A1 entitled “Supply chain demand system and forecasting”, by N. Singh, S. Olasky et al., a forecasting system is described which supports multi-scenario comparisons. However, this system uses different algorithms for different scenarios, and the problem of dealing with parameters in a generic and extensible way has not been tackled.
The invention seeks to obviate and/or mitigate the limitations of known forecasting algorithms, for example, by obviating or mitigating the need to reconfigure a forecasting algorithm each time the model on which it is based changes. It does so by providing a generic forecasting tool which is able to accommodate any number and type of parameters, and which is able to modify existing parameters dynamically during the operation of the system. This reduces the skill set required to generate forecasts using time-series data having different contexts and/or parameters and/or parameter types (for example, where the time-series data has varying levels of granularity) by encapsulating the time-series data within a single generic data structure (via a forecasting data type). This encapsulation of data enables the forecast algorithm to be simplified as it removes any need for the forecasting algorithm to incorporate means to pre-process the time-series data. This reduces the programming complexity of the forecasting algorithm, and enables faster forecasts to be obtained while still allowing forecasts to be generated from time-series which have differing contexts and/or parameters and/or parameter types (as the forecast data type can represent time-series data which comprises more than one type or level of data encapsulation). The need for the forecasting algorithm to pre-process data is removed as the time-series data is pre-processed (also known as being “groomed”) separately and is effectively provided in a pre-processed format to the forecasting algorithm. This also enables forecasts to be obtained using different data series dynamically without requiring the algorithm to be re-configured.
The invention also seeks to provide a forecasting data type (FDT) which abstracts all different forecasting parameters into a single entity. This enables forecasting systems to be developed which are as generalized as possible and remove the need for the application developer to have to modify the forecasting system for every new set of customer requirements.
A first aspect of the invention seeks to provide a method of populating a forecasting system with time-series data, wherein the context of the time-series data is determined by one or more parameters encapsulated within a forecast data type, the forecast data type being arranged to present the time-series data in a generic form independent of any context information to a forecasting algorithm of the forecasting system, wherein the time-series data is encapsulated to enable the forecasting algorithm to generate a forecast for the time-series dependent on said context, the method comprising:
The invention thus provides a way for a forecasting engine to utilize large and more complex time-series data. The forecasting engine receives data which is in a generic form and so avoids the processing burden associated with pre-processing time-series data into a form appropriate for generating a forecast over. The time to generate the forecast is thus reduced, enabling more forecasts to be provided in a given period. This is advantageous in technical fields where data prediction is time-critical. For example, if auto-correction to some component of a physical system is to be provided on the basis of the prediction from the time-series forecast, rapidly determining the forecast may be essential.
Mapping time-series data into a generic forecast data type is similar to “grooming” the time-series data for the forecasting system. As the system itself only perceives “groomed” data, additional data can be dynamically considered by the forecasting algorithm. There is no need to reconfigure the forecasting algorithm each time new types of data are to be included in the time-series data over which the forecast is to be generated.
In one embodiment, the number of parameters providing the time-series data with the pre-determined context is modified by the forecast data type during the operation of the forecasting system.
In one embodiment, the type of at least one parameter providing the time-series data with its pre-determined context is modified by the forecast data type during the operation of the forecasting system.
In one embodiment, the forecasting data type is arranged to provide a plurality of parameters which form a hierarchy.
In one embodiment, the forecasting data type is arranged to provide a plurality of parameters which do not form a hierarchy.
In one embodiment, said forecast system comprises a forecast application arranged to parse received parameters required by a forecasting model of the forecast system, and wherein the forecast data type (FDT) is arranged to enable said forecast application to parse a plurality of different parameters required by said forecast model to enable a plurality of different forecast strategies to be applied on said parameters without the need to reconfigure the forecast algorithm.
In one embodiment, the abstract FDT represents leaf parameters of the time-series data over which a forecast is to be obtained.
In one embodiment, the forecast data type represents leaf parameters in such a way that aggregate data can be determined dynamically and provided to the forecast algorithm.
A second aspect of the invention seeks to provide a forecast system comprising a forecasting application and a forecast model, the forecast model being arranged to access a plurality of differing types of parameter time-series, each differing type of parameter time-series being accessed in the appropriate context by the forecast model receiving a set of time-series database entries, in which the forecast model itself is not able to distinguish between different parameters.
A third aspect of the invention seeks to provide a forecast data type (FDT) arranged to provide a forecasting system with time-series data having a pre-determined context represented by a predetermined number of differing parameters, each having a predetermined parameter type, the forecasting system comprising a forecasting application arranged to parse the different parameters required by a forecast model of said forecast system, said forecast data type being arranged to provide said parameters in a relevant context to enable different strategies to be applied on said parameters without the need to reconfigure the forecast algorithm.
Another aspect of the invention seeks to provide a forecast data type object comprising an object of the forecast data type as claimed in claim 10, in which each FDT object associates four different logical entities which collectively apply the relevant context information to the leaf level parameter time-series data.
In one embodiment, one logical entity comprises a set of data arranged to identify the attributes of the FDT object.
In one embodiment, another logical entity of the FDT object comprises a set of data arranged to maintain the hierarchical relationship among all the identified attributes of the FDT object.
In one embodiment, another logical entity of the FDT object is arranged to store information associated with each identified attribute of the FDT object.
In one embodiment, another logical entity of the FDT object represents said time-series data used by said forecasting algorithm by retrieving appropriate historical values for said leaf level parameters from a data store comprising said historical leaf level parameters and their associated values.
In one embodiment, the FDT object is arranged to represent the historical values of a particular leaf level parameter to ensure the time-series data passed to the forecasting algorithm has an appropriate context.
In one embodiment, said logical entity comprises: a system generated identifier to identify each attribute uniquely; a description of the attribute; a data store associated with at least one other data store arranged to provide leaf-level parameter values.
In one embodiment, said logical entity comprises for each attribute: a primary key associated with the data store associated with the attribute; an attribute type; any parent attribute of the attribute; an attribute level; an attribute name.
In one embodiment, said logical entity comprises: the FDT identifier for said forecast data type object; at least one attribute type for the FDT object; and at least one attribute identifier for the FDT object.
In one embodiment, said logical entity comprises: a historical value for the FDT object; a time value associated with said historical value; and an FDT object identifier for said historical value.
Another aspect of the invention relates to a forecasting system arranged to be customized for specific forecasting requirements by customizing the population of the table entries of the forecast data type object aspect.
Another aspect relates to a method of populating a forecasting system with time-series data having a pre-determined context represented by a predetermined number of parameters, each having a predetermined parameter type, the forecasting system being arranged to generate a forecast for the time-series dependent on said context, the method comprising: retrieving the time-series data using a generic forecast data type object, said generic forecast data type object being arranged to provide said time-series in said pre-determined context, wherein said context may be presented by said forecast data type providing a variable number and type of parameters to the forecasting system without requiring the forecasting system to be re-configured to provide the forecast over the time-series data.
Another aspect relates to a method of forecasting using time-series data comprising the steps of: encapsulating one or more parameters representing the context of the time-series data within a forecast data type; presenting the time-series data using said forecast data type in a generic form independent of any context information to a forecasting algorithm of a forecasting system; populating the forecasting system with time-series data; and generating a forecast by a forecast algorithm of the forecasting system receiving said time-series data from the forecast system, and using said data to generate a forecast; wherein, in said step of populating the forecasting system with time-series data, the time-series data is retrieved using a generic forecast data type object, said generic forecast data type object being arranged to provide said time-series in said pre-determined context, and wherein said context presented by said forecast data type is capable of changing by said forecast data type object representing a variable number and type of parameters to the forecasting system, wherein said forecast data type is arranged to provide a different number and/or type of parameters to the forecast system without requiring the forecasting algorithm to be re-configured to provide the forecast over the time-series data.
Another aspect relates to a method of pre-processing time-series data to populate a forecasting system independently of the type or context of the time-series data, wherein the forecasting system is arranged to generate a forecast for each type of time-series data dependent on said context and type, the method comprising the steps of: determining the context of each type of time-series data, wherein the context is represented by one or more parameters of one or more parameter types, mapping each time-series data to one or more forecast data type objects by encapsulating the time-series data and its context within a generic forecast data type, whereby said forecast data type objects are capable of presenting said time-series data and said context in encapsulated form to said forecasting system, whereby said forecasting system populated with said encapsulated time-series data and said context using said generic forecast data type object is arranged to process said received forecast data type objects to generate a forecast for said time-series.
In one embodiment, said time-series data pre-processed to populate said forecasting system includes data having differing contexts and/or capable of being differently encapsulated with their context(s), whereby said forecast data type objects are arranged to present said differently encapsulated time-series data and context(s) in a generic form to the forecasting system.
Another aspect relates to a method of operating a forecasting system to generate a forecast using a generic data structure, the generic data structure being arranged to encapsulate data at one or more different context levels, the method comprising: populating the forecasting system independently of the context of the time-series data over which a forecast is to be obtained, wherein the forecasting system is arranged to generate a forecast for each type of time-series data dependent on said context and type, by: determining the context of each type of time-series data, wherein the context is represented by one or more parameters of one or more parameter types, mapping each time-series data to one or more forecast data type objects by encapsulating the time-series data and its context within a generic forecast data type, presenting said time-series data in an encapsulated form to said forecasting systems using said forecast data type objects, and processing said encapsulated time-series data to generate a forecast for said time-series, wherein said forecasting system automatically determines from each received forecast data type the context for generating a forecast using the time-series data.
In one embodiment, the forecast for the time-series data is generated by the forecasting system at the same encapsulation level as the encapsulated time-series.
In one embodiment, the forecast system generates a forecast using said encapsulated data at a differing level of encapsulation from the encapsulated time-series.
Another aspect relates to a database of stored forecast data type objects, the objects arranged for use in any of the method aspects.
Another aspect relates to apparatus arranged to support the operation of one or more computer programs, wherein said one or more computer programs, when implemented on said apparatus, are arranged to perform appropriate steps in any method aspect. Those skilled in the art will appreciate that the above aspects are as defined in the independent claims and that the aspects may be combined with each other and with any appropriate embodiments in any suitable manner apparent to those skilled in the art.
Thus the invention provides a sophisticated abstraction which hides the specific characteristics of any set of forecasting parameters, thus providing the system with a single and stable interface. Parameters can now be added, removed and even modified without any modification required to the forecasting application. Re-usability and extensibility of the system are therefore increased.
The preferred embodiments of the invention will now be described with reference to the accompanying drawings which are by way of example only and in which:
The best mode of the invention as currently contemplated by the inventors will now be described with reference to the accompanying drawings. Those skilled in the art will appreciate that, for clarity and brevity, features capable of providing equivalent functionality to the features of the embodiments described later hereinbelow which are apparent to those skilled in the art, are considered to be implicitly disclosed by the description, unless such features are explicitly excluded.
The term “time-series” is defined herein to comprise a collection of observations made sequentially through time, for example, sales of a particular product in successive months or the work demand volumes over a number of consecutive days. The term “time-series forecasting” is defined to comprise a method of computing forecasts based on present and past values of the series. The complexity of this process can range from something simple (like an average of the past 2 historic values) to more advanced mathematical concepts.
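By way of illustration only, the simple end of this range (an average of the past two historic values) might be sketched as follows. The function name `forecast_next`, the `window` argument and the sample figures are assumptions introduced for this sketch and do not appear in the embodiments.

```python
from statistics import mean


def forecast_next(history, window=2):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(history) < window:
        raise ValueError("not enough history for the requested window")
    return mean(history[-window:])


# Hypothetical monthly sales figures, oldest first.
monthly_sales = [120, 135, 128, 140]

# The forecast for the next month is the mean of the last two values (128 and 140).
next_month = forecast_next(monthly_sales)
```

More advanced models would replace the body of `forecast_next` while leaving the shape of the input, an ordered series of past values, unchanged.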
In the forecasting system 110 shown in
Those skilled in the art will appreciate that the distribution of data amongst one or more data stores may differ in different embodiments of the invention.
In order for a forecasting application to be able to implement a forecasting method, it must be provided with appropriate parameters. For example, if a forecast is to be made on the volume of job requests of a particular type over a particular future time period in a particular geographic location, then two parameters might be (a) the geographic location (where the job should be performed) and (b) the type of job. In order to generate the forecast over this set of parameters, both have to be considered by the corresponding forecasting application. In general, however, numerous parameters may be applicable to any time-series forecasting scenario.
The forecasting data type according to the invention represents specific parameters in a generic abstraction. The specific characteristics of any set of forecasting parameters are encapsulated by the generic forecasting data type, enabling the forecasting system to be provided with a single interface to the data set on which the forecast is to operate. This enables parameters to be added, removed and even modified without any modification required to the forecasting application. Re-usability and extensibility of the forecasting system are increased.
For clarity, an embodiment of the invention will now be described which is highly simplistic but representative of the invention. In this embodiment, a forecasting application supported by a suitable hardware platform generates forecasts for future job requests using a number of data items. This requires the forecasting application to access database information which may be hosted on the same hardware platform as the forecasting algorithm and/or on one or more remote platforms. The database information will contain differing types of information; for example, historical data comprising the volumes (i.e., numbers) of previous job requests, forecast data comprising the forecasted volumes for future job requests, and data representing one or more external factors. An example of an external factor which may or may not influence the forecast is the weather, and thus weather information may also need to be accessed by the forecasting application. Those skilled in the art will appreciate that the database entries accessed may be retrieved from one or more databases, a term which is defined herein to comprise a collection of data records arranged in a systematic manner from which information can be retrieved by performing a look-up operation based on at least one parameter. The physical data records may or may not be co-located on the same physical platform.
The invention enables forecast applications to be developed in which the forecasting algorithm used to generate the forecast in step 2 operates on a generic data type, referred to herein as a forecasting data type. The forecasting algorithm is based on a forecast model (which generates the forecasts according to the particular characteristics of the forecast model). The forecast model is required to access the various time-series of the differing types of information represented as parameters in the database entries.
A forecast model according to the invention is able to access each differing type of parameter time-series in the appropriate context, as the forecast model will receive only a set of time-series database entries. According to the invention, a forecast model is provided which does not have any intelligence to distinguish between different parameters etc. An example of History and Forecast database entries such as may be used for a conventional forecasting scenario are shown below in Tables 1 and 2 respectively:
In Tables 1 and 2, the two parameters (Geographic Area and Work Type) are followed by a specific date and a value. The value denotes how many jobs have been requested or how many jobs have been forecasted to be requested (in the History and Forecast tables respectively). The parameters may be hierarchical: for example, “Geographic Area” is hierarchical in the sense that the United Kingdom (“UK”) is regarded as a parent of “London” and “Manchester”.
This simple, two-level hierarchical structure is shown schematically on
Consider the case where parent #1 represents the “UK” geographic area parameter and leaf #1 London, and leaf #2 Manchester (these are all geographic areas within the boundaries of the United Kingdom).
In order to generate a forecast, the forecasting application must access and read the relevant parameter information and run the corresponding mathematical formulas. In a conventional forecasting application, the dependence of the forecasting application on access to a specific set of parameters limits the extensibility and adaptability of the forecasting system.
In the above example, the forecasting application reads the combinations of all “Geographic Area” and “Work Type” parameters in order to operate. Forecasting on {Area=London, Type=A} and on {Area=UK, Type=A} are two different things. In this embodiment, the forecasting model will assume a strict hierarchical format (like the one for geographic areas shown in
In practice, more complex parameter relationships may exist, such as are shown for example, in
FIGS. 3A,3B,4A and 4B show schematically examples of the strategies which may be used to generate a forecast at a particular level of the parameter hierarchy.
When a forecast is to be generated at the leaf level, if this represents the finest level of granularity of the parametrised data, then generating a forecast can be done as shown in
Those skilled in the art will be aware that depending on the forecast model used, each of the conventional strategies shown schematically in FIGS. 3A,3B,4A, and 4B can give different results.
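The following sketch illustrates, under stated assumptions, why the choice of strategy can matter. The two strategies shown (forecasting each leaf and summing the results, versus summing the leaf histories and forecasting the total) and the toy median-based model are assumptions chosen for illustration; they are not asserted to be the strategies depicted in the figures.

```python
from statistics import median


def forecast(series):
    """Toy forecaster (an assumption): median of the last three observations."""
    return median(series[-3:])


# Hypothetical leaf-level histories for two geographic areas.
leaf_history = {
    "London": [1, 5, 3],
    "Manchester": [6, 2, 2],
}

# Strategy 1: forecast each leaf separately, then add the forecasts together.
forecast_then_sum = sum(forecast(series) for series in leaf_history.values())

# Strategy 2: add the leaf histories point by point, then forecast the total.
parent_history = [sum(values) for values in zip(*leaf_history.values())]
sum_then_forecast = forecast(parent_history)

print(forecast_then_sum, sum_then_forecast)
```

With a non-linear model such as the median the two results differ, whereas a purely linear model (for example a plain mean) would give the same answer under both strategies.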
The forecast data type (FDT) provided by one embodiment of the invention is arranged to enable the forecast application to parse the different parameters required by the forecast model and enables different strategies to be applied on the parameters without the need to reconfigure the forecast application.
Effectively, by providing an FDT, the invention ensures that forecast data can be represented at the leaf level in the hierarchy of parameter values. A FDT is able to represent each of the different parameters at the leaf level such as is shown in the example parameter hierarchy illustrated schematically in
In
The information relating to the parent parameter in Table 1, i.e., UK-wide information, is no longer required, as the abstract FDT represents the leaf parameters in such a way that aggregate data can be determined dynamically: each parent parameter value represents the aggregation of the values of its dependent nodes. There is no need to store in the database the values corresponding to parameters at higher levels of the parameter hierarchy (i.e., any parameter which is not at a leaf node level).
By enabling the retrieval and aggregation of the forecasting data dynamically (i.e., after compiling, e.g., when the forecasting application is running), more complex information can be retrieved: it is possible to aggregate data over longer time periods, over larger geographic areas, etc.
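A minimal sketch of this dynamic aggregation is given below, assuming that aggregation means summation of job volumes. The dictionaries `leaf_values` and `parent_of`, the function `aggregate_parent` and all sample values are hypothetical and serve only to show that a parent series need not be stored.

```python
from collections import defaultdict

# Hypothetical leaf-level history: (area, date) -> number of job requests.
leaf_values = {
    ("London", "2005-01-01"): 10,
    ("London", "2005-01-02"): 12,
    ("Manchester", "2005-01-01"): 4,
    ("Manchester", "2005-01-02"): 6,
}

# Two-level hierarchy: each leaf area names its parent area.
parent_of = {"London": "UK", "Manchester": "UK"}


def aggregate_parent(parent):
    """Build the parent's series at run time by summing its leaves per date."""
    totals = defaultdict(int)
    for (area, date), value in leaf_values.items():
        if parent_of.get(area) == parent:
            totals[date] += value
    return dict(sorted(totals.items()))


uk_series = aggregate_parent("UK")
```

Only the London and Manchester rows are held in `leaf_values`; the UK series exists solely as the result of the call.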
The FDT is the only single parameter the software forecasting application needs to access. All the intelligence relating to the context of the data the forecasting model requires is encapsulated within the structure of the FDT. This means that one or more parameters can be added, removed, or modified without impacting the design and/or implementation of the forecasting application.
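The consequence for the forecasting application can be sketched as follows: the application sees only FDT identifiers and their series, never the individual parameters. The names `run_forecasts` and `mean_of_last_two`, and the sample data, are assumptions made for this illustration.

```python
from typing import Callable, Dict, Sequence

Series = Sequence[float]


def run_forecasts(
    series_by_fdt: Dict[int, Series],
    model: Callable[[Series], float],
) -> Dict[int, float]:
    """Apply one forecast model to every FDT; no parameter-specific logic."""
    return {fdt_id: model(series) for fdt_id, series in series_by_fdt.items()}


def mean_of_last_two(series: Series) -> float:
    """Toy forecast model (an assumption): mean of the last two values."""
    return sum(series[-2:]) / 2


# Hypothetical input: which area or work type each FDT stands for is
# invisible to the application and lives entirely in the FDT tables.
series_by_fdt = {1: [2, 4], 2: [10, 20, 30]}
forecasts = run_forecasts(series_by_fdt, mean_of_last_two)
```

Adding a further parameter would, in this sketch, change only the set of FDT identifiers supplied to `run_forecasts`, not the function itself.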
The invention does not remove the complexity of tampering with parameters from the forecasting system, but instead moves the complexity from the software part to the population of the database.
An important implication of this step is that application developers may now design and implement a product generic enough to accommodate a plethora of differing scenarios, as each scenario is customized by customizing the population of the FDT table entries. This is a simpler and more robust process compared to re-factoring the system's code.
The implementation of the FDT concept has two aspects: one related to the database and one to the software architecture. Database related issues are used during the configuration of the system, while the software issues are used during the generation of the forecast.
An application developer needs to be able to generate specific forecasts, for example, to satisfy each customer's specific requirements. The number and type of parameters that describe the nature of the data to be forecasted may differ considerably depending on the required forecast.
According to the invention, an FDT is an abstract aggregation of the different parameters related to the forecasting operation. As an example, consider the case where a forecast is required and historical data is available for only two parameters: geographic area and work type. This is the specific forecast data for the available parameters, which will be referred to herein as “Customer owned” data. A simple database structure must be defined to map the FDT concept to these available forecast parameters.
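One possible way to picture this mapping, offered as an assumption rather than as the embodiment's procedure, is to assign one FDT identifier to each combination of leaf-level values of the two parameters. The function `build_fdt_details`, the sequential identifiers and the sample values are hypothetical.

```python
from itertools import product

# Hypothetical leaf-level values of the two "customer owned" parameters.
leaf_attributes = {
    "Geographic Area": ["London", "Manchester"],
    "Work Type": ["A", "B"],
}


def build_fdt_details(attributes):
    """Assign an FDT id to every combination of leaf-level attribute values."""
    domain_types = list(attributes)
    details = {}
    combinations = product(*attributes.values())
    for fdt_id, combination in enumerate(combinations, start=1):
        details[fdt_id] = dict(zip(domain_types, combination))
    return details


fdt_details = build_fdt_details(leaf_attributes)
for fdt_id, attrs in fdt_details.items():
    print(fdt_id, attrs)
```

Introducing a third parameter would add a key to `leaf_attributes` and enlarge the set of FDTs, while anything that consumes FDT identifiers could remain unchanged.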
In
In
The TB_DOMAIN logical entity 40 shown in
i Domain Type: System generated ID to identify each attribute (or in one embodiment the domain) uniquely.
v Domain Desc: Description of the attribute (e.g., in one embodiment the domain)
v Entity Name: Corresponding database table of the “customer owned” database.
v Key Column: Primary key for the physical entity. (primary key column name)
v Parent Key: Parent Key for the physical entity.
The term “primary” key is defined herein to refer to a unique identifier for an attribute of the FDT, for example, any value which distinguishes the different attribute entries in the relevant data store.
The term “domain” is used hereinbelow with reference to the drawings to refer to a specific embodiment of an attribute of the FDT.
For the given example, one embodiment of an I_Domain_Type structure is shown below in Table 4:
Also shown in
v Domain Id: Primary key of the corresponding Domain table.
i Domain Type: Type for the domain.
i Parent Domain: Parent Domain of the domain attribute.
i Level: Level of the domain attribute.
v Domain Name: Name of the domain attribute.
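The fields listed above are sufficient to walk the hierarchy. The sketch below models one row as a Python dataclass and finds the leaf attributes beneath a given attribute; the class `DomainNode`, the identifiers "G1" to "G3" and the use of string identifiers throughout are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DomainNode:
    """One hypothetical row of TB_DOMAIN_HIERARCHY."""
    domain_id: str
    domain_type: int
    parent_domain: Optional[str]  # None for a top-level attribute
    level: int
    domain_name: str


rows = [
    DomainNode("G1", 1, None, 1, "UK"),
    DomainNode("G2", 1, "G1", 2, "London"),
    DomainNode("G3", 1, "G1", 2, "Manchester"),
]


def leaf_descendants(rows: List[DomainNode], domain_id: str) -> List[str]:
    """Return the ids of all leaf attributes at or below `domain_id`."""
    children = [row for row in rows if row.parent_domain == domain_id]
    if not children:
        return [domain_id]  # no children: the attribute is itself a leaf
    leaves: List[str] = []
    for child in children:
        leaves.extend(leaf_descendants(rows, child.domain_id))
    return leaves


uk_leaves = leaf_descendants(rows, "G1")
```

Because the walk uses only the parent reference, the same function would serve a deeper hierarchy without modification.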
Table 5 below provides examples of the attributes which would appear in TB_DOMAIN_HIERARCHY for the given example:
The TB_FDT_DETAILS data set 44 shown in
i fdt id: FDT id for which the detail is mentioned.
i domain type: Domain type for the FDT.
v domain id: Domain id for the FDT.
Table 6 below provides examples of the information which appears in a TB_FDT_DETAILS data set 44 for the exemplary scenario:
Finally,
i fdt id: FDT id for which the historical value is mentioned.
d date: The corresponding date
n value: The historical value for the corresponding FDT for the given date.
For the exemplary scenario, the TB_TIME_SERIES data set 46 in one embodiment of the invention is shown in Table 8:
As Table 8 shows, only the leaf level FDTs are stored in the database. The parents in the parameter hierarchy are computed at run time.
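As a hedged illustration, the four logical entities could be realised as tables in an in-memory SQLite database, with a parent series computed at run time by a query over the leaf FDTs. The underscore column names, the SQLite column types, the helper `aggregated_series` and the sample rows are all assumptions derived loosely from the field lists above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TB_DOMAIN (
    i_domain_type INTEGER PRIMARY KEY,
    v_domain_desc TEXT,
    v_entity_name TEXT,
    v_key_column  TEXT,
    v_parent_key  TEXT
);
CREATE TABLE TB_DOMAIN_HIERARCHY (
    v_domain_id     TEXT,
    i_domain_type   INTEGER,
    i_parent_domain TEXT,
    i_level         INTEGER,
    v_domain_name   TEXT
);
CREATE TABLE TB_FDT_DETAILS (
    i_fdt_id      INTEGER,
    i_domain_type INTEGER,
    v_domain_id   TEXT
);
CREATE TABLE TB_TIME_SERIES (
    i_fdt_id INTEGER,
    d_date   TEXT,
    n_value  REAL
);
""")

# Hypothetical history for two leaf-level FDTs only; no parent rows are stored.
conn.executemany(
    "INSERT INTO TB_TIME_SERIES VALUES (?, ?, ?)",
    [(1, "2005-01-01", 10), (1, "2005-01-02", 12),
     (2, "2005-01-01", 4), (2, "2005-01-02", 6)],
)


def aggregated_series(conn, leaf_fdt_ids):
    """Sum the stored leaf series per date to obtain a parent-level series."""
    marks = ",".join("?" for _ in leaf_fdt_ids)
    query = (
        "SELECT d_date, SUM(n_value) FROM TB_TIME_SERIES "
        f"WHERE i_fdt_id IN ({marks}) GROUP BY d_date ORDER BY d_date"
    )
    return conn.execute(query, list(leaf_fdt_ids)).fetchall()


parent_series = aggregated_series(conn, [1, 2])
```

In this sketch, customising the system for a new scenario amounts to inserting different rows into the four tables rather than changing the query.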
In
In order to configure the forecasting system, the tables shown in
The next part explains how the software is implemented in order to generate forecasts based on the concept of the FDT.
Once the database tables have been populated (configuration phase), the forecast operation can now be performed.
In
A more detailed embodiment of the invention in which a complex parameter relationship exists will now be described. Consider an example in which the FDT information is represented as follows:
Firstly, by a TB_DOMAIN, which in this example is the same as in the previous exemplary scenario, and which is shown below in Table 9:
The TB_DOMAIN_HIERARCHY will now be modified. This is because we assume that the Work Type hierarchy is now changed from the flat A & B types into the structure shown in
This type of structure may be represented in TB_DOMAIN_HIERARCHY as follows:
The TB_FDT_DETAILS data set 44 will now change to include the following different Work Types:
If the application requests the generation of forecast at a higher FDT level, this is provided by aggregating the leaf level nodes appropriately, as is shown in
Thus in
In
In
Finally in 12D, FDT “1” shows that in order to get a complete system forecast, it is necessary to aggregate all the leaf nodes.
Thus the invention provides a Forecast Data Type (FDT) which is related to the operation of time series forecasting. In order to produce accurate forecasts, numerous parameters need to be considered, such as geographic area, type of work, etc.
The large number of potential parameters that may be required, in addition to the dynamics of an ever changing customer model, raise the need for a generic forecasting tool. The invention seeks firstly to be able to accommodate any number and type of parameters, and secondly to be able to modify existing parameters during the operation of the system. By providing a single generic FDT, both issues are addressed as all different forecasting parameters are now abstracted into a single entity.
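The aggregation of leaf nodes for a higher-level FDT can be sketched as follows. The hierarchies `area_parent` and `work_parent` (including the sub-types "A1" and "A2"), the FDT identifiers 10 to 13 and the history values are hypothetical; the actual Work Type structure is that shown in the drawings.

```python
# Hypothetical hierarchies: child -> parent (None marks the root).
area_parent = {"UK": None, "London": "UK", "Manchester": "UK"}
work_parent = {"All": None, "A": "All", "A1": "A", "A2": "A", "B": "All"}

# Hypothetical leaf-level FDTs: fdt_id -> (area, work type).
leaf_fdts = {
    10: ("London", "A1"),
    11: ("London", "A2"),
    12: ("London", "B"),
    13: ("Manchester", "A1"),
}

# Stored history exists for leaf-level FDTs only.
history = {10: [3, 4], 11: [1, 2], 12: [5, 5], 13: [2, 1]}


def is_within(node, ancestor, parent):
    """True if `node` equals `ancestor` or lies beneath it in the hierarchy."""
    while node is not None:
        if node == ancestor:
            return True
        node = parent[node]
    return False


def series_for(area, work_type):
    """Aggregate the leaf FDTs that fall under the requested attributes."""
    members = [
        fdt_id
        for fdt_id, (leaf_area, leaf_work) in leaf_fdts.items()
        if is_within(leaf_area, area, area_parent)
        and is_within(leaf_work, work_type, work_parent)
    ]
    return [sum(values) for values in zip(*(history[m] for m in members))]


london_type_a = series_for("London", "A")  # combines FDTs 10 and 11
whole_system = series_for("UK", "All")     # combines every leaf FDT
```

Requesting the root of both hierarchies corresponds to the complete system forecast, for which every leaf node contributes.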
Those skilled in the art will appreciate that many minor modifications and functional equivalents exist to the features described herein above and the invention is to be considered to include all such modifications and functional equivalents where apparent to those skilled in the art.
Number | Date | Country | Kind |
---|---|---|---|
0502494.8 | Feb 2005 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB06/00396 | 2/6/2006 | WO | 7/11/2007 |