Analytical applications are increasingly based on very broad sets of data. This often occurs because most systems of record deliver improved process information along integrated process chains, which join together attributes from transactional and master data. Potentially, each of these joined attributes could be relevant for discovering insights regarding success drivers or critical situations within the data. As a result, an end user is confronted with a large list of attributes that could be of interest for further exploration, without any indication of which attributes are more relevant than others.
Different types of data can be combined and used as a basis for analysis. This can result in a list that ranges from dozens to hundreds of attributes, depending on the amount of data being used. Without system support, it is very difficult for an end business user to select the best attributes for further analysis.
Thus, there remains a need in the art for a system that allows users to distinguish between useful attributes and attributes that may not be relevant to the user. There also remains a need in the art for a system to allow for the analysis of multidimensional data by measuring the potential impact of each attribute on business success, to allow an end user to make a more educated decision based on the presented attributes.
A system and method are described herein that provide for performing impact analysis for influencing attributes in a sales forecasting system. The sales forecasting system uses integrated predictive and statistical methods to help measure the variance of relevant data sets to guide an end user to relevant influencing attributes. The sales forecasting system may perform a statistical analysis to derive a sequence for the influencing attributes, and display the attributes to an end user in a specific sequence based on the performed statistical analysis.
In particular, the exemplary embodiments and/or exemplary methods are directed to a system and method for measuring an impact of various attributes based on data distributions of a sales forecasting system. The system and method include the step of determining a list of influencing attributes based on historical data retrieved from storage in an in-memory database and measuring a data distribution of each of the influencing attributes. The system and method also include the step of sorting the influencing attributes using a variance analysis, where a measure of variance is determined for each of the influencing attributes. In particular, an analysis of variance statistical model can be used.
The system and method also include the step of ordering the influencing attributes in descending order based on the determined measure of variance for each of the influencing attributes, with influencing attributes having higher measures of variance determined to be more relevant. The influencing attributes can also be ordered in descending order based on a determined F-value.
The measure of variance and/or F-value can be determined as a function of a sum of squares of all opportunities and a sum of squares of opportunities of an individual influencing attribute. The sum of squares of opportunities of the specific influencing attribute can itself be determined as a function of attribute values of the individual influencing attribute. The measure of variance and/or F-value can also be determined as a function of the number of attribute values of the individual influencing attribute, a mean expected value of the individual influencing attribute, and a mean expected value of all opportunities.
The system and method also include the step of generating a list of attribute values from a selected ordered influencing attribute. In this case, upon a user selection of at least one attribute value from the list of attribute values, at least one graphical display can be generated to compare the at least one attribute value to another selected attribute value. The at least one generated graphical display can illustrate a heterogeneity of the data distribution of the selected ordered influencing attribute.
The historical data, opportunity data, and the analysis of variance statistical model can all be stored in an in-memory database. Some of the influencing attributes can be calculated instantaneously when the historical data is retrieved from the in-memory database.
An advanced business application programming (ABAP) system can also be used to access the stored historical data, opportunity data, and analysis of variance statistical model from the in-memory database if needed. The sales forecasting application can also be implemented on an integrated business platform.
The subject matter will now be described in detail for specific preferred embodiments, it being understood that these embodiments are intended only as illustrative examples and that the subject matter is not to be limited to these embodiments.
Previous analytic applications were directed towards elaborated personalization and adaptation features that were ineffective in selecting the most pertinent attributes of sales orders and sales forecasting data. Embodiments provide a system and method for performing impact analysis for influencing attributes in a sales forecasting system. The sales forecasting system uses integrated predictive and statistical methods to help measure the variance of relevant data sets to guide an end user to relevant influencing attributes. The sales forecasting system may perform a statistical analysis to derive a sequence for the influencing attributes, and display the attributes to an end user in a specific sequence based on the performed statistical analysis.
In an example embodiment, application 20 may be an application that is implemented on a back end component and displayed on a user interface on user terminal 10. In another embodiment, the application may be a computer-based application stored in the main memory database of user terminal 10.
In an example embodiment, the system and method may include one or more processors 30, which may be implemented using any conventional processing circuit and device or combination thereof, e.g., a central processing unit (CPU) of a personal computer (PC) or other workstation processor, to execute code provided, to perform any of the methods described herein, alone or in combination. In an embodiment, the executed code may be stored in a main memory database of user terminal 10. In this example embodiment, the main memory database may be an in-memory database such as SAP HANA™, where data is stored in the main memory (RAM).
Database 35 may also include data, for example, pertaining to “Sales History”, “Current Pipeline”, and “Snapshot Data”, which may provide data that may be viewed in a graphical manner by an end user. It should be understood that the examples of stored data as illustrated in
In an embodiment, database 35 may also store an analysis of variance (ANOVA) statistical algorithm. The ANOVA algorithm may be applied as a stored procedure and may be used to join the various data.
The sales forecasting application 20 may be displayed on a user interface 25. User interface 25 may be designed specifically to provide an interaction flow to allow for combining the analytics on the retrieved data with visualizations derived from the retrieved data. In an embodiment, user interface 25 may be configured to display the integrated business platform such as SAP Business ByDesign™. The layout of the user interface 25 may be written in a plurality of programming languages. In an example embodiment, as illustrated in
In an embodiment, data may be directly accessed from database 35 by the application. In another embodiment, the data from database 35 may be accessed using an advanced business application programming (ABAP) system 40. ABAP system 40 may be a web-based service defined in an internet communication framework and may issue a secondary database call to database 35 to access the stored data. In an embodiment, ABAP system 40 may also retrieve the stored ANOVA model.
As illustrated in
In order to focus on the most significant influencing attributes, the system may sort the attributes by relevance and display the attributes in selection field 115 based upon the sorting. In an example embodiment, the sort order, as well as the subsequent interactive analysis, may be driven by the assumption that a heterogeneous distribution of revenues across different groups, for example, different industries, is critical to differentiate between successful and unsuccessful business segments.
Sorting of the influencing attributes may be done by statistical methods, which may determine how the influencing attributes are presented to the end user. The statistical methods may also be used to guide the user through the numerous attributes in a time-efficient manner. The highly relevant influencing attributes may be identified by measuring the data distribution, and thereby the heterogeneity of the data. This may result in a sorted list of influencing attributes which may be displayed in selection field 115, with the most relevant influencing attributes on top. The sorting of the influencing attributes may be rooted in the assumption that heterogeneously distributed data is more relevant than homogeneously distributed data.
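The sorting described above can be sketched as follows. This is a minimal illustration only, assuming the textbook one-way ANOVA F statistic (introduced in the following paragraphs) as the heterogeneity measure. The record layout, the field name `expected_value`, and the helper names `f_value` and `rank_attributes` are hypothetical and are not taken from the described system.

```python
from collections import defaultdict


def f_value(groups):
    """Textbook one-way ANOVA F statistic for a list of numeric groups."""
    groups = [g for g in groups if g]          # ignore empty groups
    k = len(groups)                            # number of attribute values
    n = sum(len(g) for g in groups)            # number of opportunities
    if k < 2 or n <= k:
        return 0.0                             # not enough data to compare groups
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_attributes(opportunities, attributes, value_key="expected_value"):
    """Return (attribute, F-value) pairs, most heterogeneous attribute first."""
    scores = {}
    for attr in attributes:
        by_value = defaultdict(list)           # attribute value -> expected values
        for opp in opportunities:
            by_value[opp[attr]].append(opp[value_key])
        scores[attr] = f_value(list(by_value.values()))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: revenue differs strongly by industry, hardly by region.
opportunities = [
    {"industry": "Retail", "region": "North", "expected_value": 100.0},
    {"industry": "Retail", "region": "South", "expected_value": 110.0},
    {"industry": "Energy", "region": "North", "expected_value": 300.0},
    {"industry": "Energy", "region": "South", "expected_value": 310.0},
]
ranked = rank_attributes(opportunities, ["region", "industry"])
print(ranked)  # "industry" is listed before "region"
```

In this sample, grouping by industry separates the expected values into two clearly distinct clusters, so it receives the higher score and would appear at the top of the sorted list.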
In an embodiment, sales forecasting application 20 may use the ANOVA method for statistical analysis to measure the distribution of data. The stored ANOVA algorithm may be retrieved from database 35. A measure of heterogeneity may be derived using the ANOVA method to determine an F-value. Highly relevant influencing attributes may be identified by measuring the data distribution and may be characterized by a high F-value. An F-value in the ANOVA statistical analysis may correspond to the ratio of the variance between items, here the groups defined by the values of an influencing attribute, to the variance within items. This may be reflected in Equation (i):
The ANOVA statistical method may be suited to systematically help the user understand whether the different groups show different contributions to success compared to the group as a whole. The statistical application of the ANOVA method as directed towards the influencing attributes and future opportunity data may be reflected in Table 1.
The notation reflected in Table 1 may correlate to an example embodiment, where the influencing attribute is based on the industry for the sales. In an alternate embodiment, where another influencing attribute is used, the ANOVA method may define the opportunities as a function of that specific influencing attribute.
As reflected in Table 1, each of the opportunities in the opportunity data may have an expected value. In an example embodiment where industry is an influencing attribute, a mean expected value may be determined for all opportunities and for the opportunities in a specific industry. In an alternate embodiment, where another influencing attribute is used, a mean expected value for all opportunities and for the opportunities in accordance with that influencing attribute may be listed.
In the example embodiment where the influencing attribute is the industry type, a sum of squares for all opportunity data by industry may be generated by Equation (ii). The sum of squares may be represented as the summation of the observed expected value for all won opportunities (previous opportunities that translated to sales previously) in a specific industry.
A sum of squares for all opportunity data between the various industries may be reflected by Equation (iii). The summation for opportunities between the various industries may be a function of the difference between the mean expected value for a specific industry and the mean expected value for all opportunities.
A sum of squares for all opportunity data within each industry may be reflected by Equation (iv). The summation for opportunities within each industry may be a function of the difference between the expected value of each opportunity in a specific industry and the mean expected value for that industry.
The measure for the actual variance may be determined by Equation (v). The variance may be reflected as a ratio between sum of squares for all opportunity data between the various industries taking into consideration the number of industries and the number of opportunities lost, and the sum of squares for all opportunity data within each industry taking into consideration the number of industries.
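The quantities described for Equations (ii) through (v) can be illustrated with a short worked example. This is a sketch under stated assumptions: it uses the textbook one-way ANOVA decomposition (total, between-group and within-group sums of squares, with k - 1 and n - k degrees of freedom), which may differ in detail from the equations of the described embodiment. The function name `anova_components` and the sample figures are hypothetical.

```python
def anova_components(groups):
    """Textbook one-way ANOVA decomposition.

    groups maps an attribute value (e.g. an industry) to the list of
    expected values of its opportunities. Assumes at least two non-empty
    groups, more opportunities than groups, and some within-group spread.
    """
    values = [x for g in groups.values() for x in g]
    n = len(values)                      # number of opportunities
    k = len(groups)                      # number of industries
    grand_mean = sum(values) / n         # mean expected value, all opportunities
    means = {name: sum(g) / len(g) for name, g in groups.items()}

    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(len(g) * (means[name] - grand_mean) ** 2
                     for name, g in groups.items())
    ss_within = sum((x - means[name]) ** 2
                    for name, g in groups.items() for x in g)

    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "f_value": f}


# Hypothetical expected values of opportunities, grouped by industry.
by_industry = {
    "Retail": [100.0, 120.0, 110.0],
    "Energy": [300.0, 280.0, 320.0],
    "Health": [200.0, 210.0, 190.0],
}
result = anova_components(by_industry)
print(result)
```

For this sample the between-industry sum of squares is 54200 and the within-industry sum of squares is 1200, which together make up the total of 55400 and give an F-value of (54200 / 2) / (1200 / 6) = 135.5.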
The measure of variation may be used to sort the influencing attributes accordingly in conjunction with Equation (i). The sort order may allow for an end user to focus on the most relevant influencing attributes, by creating a sorted list of influencing attributes, with the most relevant on top of the list. As illustrated, in
The evaluation of the underlying data distribution by the ANOVA method may also generate chart 130 and graph 140, which serve as a confirmation of the distribution impact analysis by displaying results in both percentage and absolute values.
After the ANOVA method has generated the sorted list of the attributes for display in selection field 115, an end user may scroll through the list of influencing attributes via a scroll bar to view the entire contents of the list. As depicted in
In an alternate embodiment, the sorting of the attribute values of the selected influencing attribute may also be made by the ANOVA method. The ANOVA method may sort the attribute values of the selected attribute in accordance with Equations (i)-(v) to determine the measure of variance for the attribute values and determine which attribute value may have the highest F-value. The attribute values in selection field 125 may be arranged in accordance with their relevance, with the attribute values that have the highest F-value, and are therefore determined to be the most relevant, arranged at the top of the selection field.
In selection field 125, a user may select to perform an analysis of one or more attribute values. This may occur through the selection of multiple attribute values in selection field 125. A user may scroll through the list of attribute values in selection field 125 via a scroll bar and click on one or more attribute values. In the example embodiment depicted in
The selection of the attribute value(s) in selection field 125 may generate a number of figures which may provide for a comparison of the attribute values. A zebra chart may be displayed in panel 130. This zebra chart may graphically display a segmented comparison of the relative share, as a percentage, of each attribute value in the success of the sales orders. In panel 140, a bar graph may be displayed comparing the attribute values. The bar graph in panel 140 may, for example, graphically display the absolute contribution of each of the selected attribute values to total revenue. In another embodiment, a graphic display may be generated depicting the growth rate for each of the attribute values.
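The figures behind such a comparison can be sketched as follows. This is a minimal illustration, assuming that the relative share is computed among the selected attribute values only and that each record carries a revenue-like amount. The function name `contribution_summary`, the field name `expected_value`, and the sample records are hypothetical.

```python
def contribution_summary(opportunities, attribute, selected_values,
                         value_key="expected_value"):
    """Absolute totals and percentage shares for the selected attribute values."""
    totals = {value: 0.0 for value in selected_values}
    for opp in opportunities:
        value = opp[attribute]
        if value in totals:                      # skip values the user did not select
            totals[value] += opp[value_key]
    grand_total = sum(totals.values())
    shares = {value: (100.0 * total / grand_total if grand_total else 0.0)
              for value, total in totals.items()}
    return totals, shares


# Hypothetical sample records.
records = [
    {"industry": "Retail", "expected_value": 150.0},
    {"industry": "Retail", "expected_value": 50.0},
    {"industry": "Energy", "expected_value": 600.0},
    {"industry": "Health", "expected_value": 999.0},   # not selected below
]
totals, shares = contribution_summary(records, "industry", ["Retail", "Energy"])
print(totals)  # absolute contributions, as a bar graph might show them
print(shares)  # percentage shares, as a segmented chart might show them
```

Here the absolute totals would feed a bar graph such as the one in panel 140, while the percentage shares would feed a segmented chart such as the one in panel 130.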
A user can interactively review the different influencing attributes and study the related distributions, statistical measures, and business trends provided by the generated figures.
The exemplary method and computer program instructions may be embodied on a machine readable storage medium such as a computer disc, optically-readable media, magnetic media, hard drives, RAID storage devices, and flash memory. In addition, a server or database server may include machine readable media configured to store machine executable program instructions. The features of the embodiments of the present invention may be implemented in hardware, software, firmware, or a combination thereof and utilized in systems, subsystems, components or subcomponents thereof. When implemented in software, the elements of the invention are the programs or code segments used to perform the necessary tasks. The program or code segments can be stored on machine readable storage media. The “machine readable storage media” may include any medium that can store information. Examples of a machine readable storage medium include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy diskette, CD-ROM, optical disk, hard disk, fiber optic medium, or any electromagnetic or optical storage device. The code segments may be downloaded via computer networks such as the Internet, an intranet, etc.
Although the invention has been described above with reference to specific embodiments, the invention is not limited to the above embodiments and the specific configurations shown in the drawings. For example, some components shown may be combined with each other as one embodiment, or a component may be divided into several subcomponents, or any other known or available component may be added. The operation processes are also not limited to those shown in the examples. Those skilled in the art will appreciate that the invention may be implemented in other ways without departing from the spirit and substantive features of the invention. For example, features and embodiments described above may be combined with and without each other. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive. The scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.