Modern businesses often purchase relevant data from various market research companies. Market data may be used by the businesses to make decisions. For example, market data may indicate how well products are selling in a target market, or whether market shares are growing, etc., which can be useful for a business reviewing its strategic plans. The market data can be provided via databases. For example, a local or regional department of a company may use the data for local market share analysis for the region and/or countries for which the local department is responsible. In this situation, the local department may extract data from a single database and generate a report based on that database. On the other hand, a global marketing department may consolidate different databases to build a global picture for global market share analysis.
That is, a user may receive market data from several different databases of different market research companies for different categories of market data, and for different countries or regions. As market research companies typically compile and deliver the market data at different time intervals, not all of the desired data may be delivered and/or available at the time when a company wishes to review the data. The end-user company may need to understand the costs and benefits of different publication dates for a specific group of data sources. The choice and use of particular publication dates and times may cause the display of data to be more or less helpful for a user. For example, if global market share values are updated every time a new database becomes available, global marketers may become confused. Data consolidation poses particular challenges, due in part to varying data delivery periods.
Databases typically have different delivery cycles and time granularities such that data from different databases may not always be available at the same time. However, some data reporting applications rely on a variety of sources and databases to assemble a report, e.g., for a user. Thus, a situation may arise in which data records (“records” for simplicity) are available (or delivered) at a particular point in time from a few of the databases, but not available (or delivered) from other databases.
Reporting software such as SAP® Demand Signal Management enables a user to execute a periodic report by offering an extrapolation function. The extrapolation function may temporarily replace missing records or records of poor quality until a next delivery or a corrected delivery is available. For example, an incomplete data set may temporarily be remedied by extrapolating the missing data values using the most recent available data for a specific category and a specific country/region. However, if a high proportion of data is extrapolated, then the reports based on the data may be inaccurate. On the other hand, it may be inefficient to wait for complete data, i.e., to avoid extrapolation altogether, because timeliness of reporting is important for making business decisions.
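As a non-limiting illustration, the temporary replacement of missing records may be sketched as follows. The sketch assumes one series of values per (category, country/region) pair, with None marking a missing or poor-quality record. The function name extrapolate_missing and the carry-forward rule are illustrative assumptions and are not taken from any particular reporting product.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (category, country/region); hypothetical key layout


def extrapolate_missing(
    series: Dict[Key, List[Optional[float]]],
) -> Tuple[Dict[Key, List[Optional[float]]], float]:
    """Fill gaps with the most recent available value of the same series.

    Returns the filled series and the share of all values that were
    extrapolated. Leading gaps (no earlier value) are left as None.
    """
    filled: Dict[Key, List[Optional[float]]] = {}
    total = 0
    extrapolated = 0
    for key, values in series.items():
        last: Optional[float] = None
        out: List[Optional[float]] = []
        for value in values:
            total += 1
            if value is None and last is not None:
                out.append(last)  # temporary stand-in until the next delivery
                extrapolated += 1
            else:
                out.append(value)
                if value is not None:
                    last = value
        filled[key] = out
    share = extrapolated / total if total else 0.0
    return filled, share
```

The returned share gives a rough indication of how much of a report would rest on extrapolated values rather than delivered values.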
The period selected for reporting, e.g., a data display refresh rate, may affect the usefulness of a report. If the refresh rate is too high, e.g., faster than the rate of analysis, then the information can overwhelm the system and cause inaccurate calculations. Therefore the data may be made public from time to time, e.g., when global data is sufficiently complete for the last calendar month. Thus, the inventors recognized a need for a system to select a publication date that strikes a balance between timely reporting and accuracy. An example of a selected publishing date may be one that does not include too many extrapolated values, where “too many” can be a defined threshold value. In other words, the inventors recognized a need for a system that can easily allow the user to customize and optimize the publication schedule of data from various data sources.
The present methods and systems may make decisions or assist decision-making regarding ways to customize and optimize a publishing date. For example, the system may automatically calculate the number of days that would be extrapolated for various publishing dates. The automatic calculation may be based on planned data delivery dates. An overview of the amount of data for each database covered in a reporting period, for each possible publishing date, may be rendered on a graphical user interface (“GUI”). The methods and systems may calculate how much data would be extrapolated and may visualize these values in the form of graphs and/or charts. One metric that may be displayed is how many days' worth of data needs to be extrapolated. The publishing date may also be referred to as a “reporting date.” For example, in some embodiments, the publishing/reporting date may serve as a basis for determining an actual publishing time. That is, a report need not be published on the “publishing date,” and the publishing date serves as a reference for an actual publishing time.
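By way of example only, the calculation of extrapolated days from planned delivery dates may be sketched as follows. The Delivery record, the function names, and the rule that a database is covered up to the latest date contained in any delivery received on or before the publishing date are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Delivery:
    delivered_on: date  # planned delivery date
    covers_until: date  # last day of data contained in the delivery


def days_to_extrapolate(
    deliveries: List[Delivery],
    publishing_date: date,
    period_start: date,
    period_end: date,
) -> int:
    """Days of the reporting period not yet delivered at publishing_date."""
    period_days = (period_end - period_start).days + 1
    covered = [d.covers_until for d in deliveries if d.delivered_on <= publishing_date]
    if not covered:
        return period_days  # nothing delivered yet: the whole period is extrapolated
    missing = (period_end - max(covered)).days
    return min(period_days, max(0, missing))


def extrapolation_overview(
    schedules: Dict[str, List[Delivery]],
    candidates: List[date],
    period_start: date,
    period_end: date,
) -> Dict[date, Dict[str, int]]:
    """Extrapolated days per database for each candidate publishing date."""
    return {
        candidate: {
            db: days_to_extrapolate(deliveries, candidate, period_start, period_end)
            for db, deliveries in schedules.items()
        }
        for candidate in candidates
    }
```

The nested mapping returned by extrapolation_overview corresponds to the kind of per-database, per-date overview that may be rendered as a graph or chart on the GUI.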
The source data 160 may be received from one or more sources such as companies providing market research data, for example Nielsen®, IRI®, and/or GFK®. The various data sources may operate on different delivery schedules, for example providing reports at different frequencies. An example delivery schedule is shown in
The regional interface/cache 110 (“regional cache” for simplicity) may receive data from one or more sources 160. The regional cache may be a memory storage where data, e.g., raw data, from sources regarding different regions or product categories may be received and stored. Such raw data may be in various formats, and/or may each correspond to different global categories and/or different countries/regions. In the example shown in
The consolidation and transformation module 120 may receive data from the regional cache 110 and process the data. Consolidation and transformation may include aggregating and/or extracting relevant data and harmonizing different delivery schedules of the source data. The consolidation and transformation may also include extrapolation and time split, as further discussed herein. The consolidation and transformation may allow the data to be better reported together. For example, the transformer may remove irrelevant data from each of the source data CA, US, DE, UK, and FR, and may perform time harmonization such as “time split,” as further discussed herein. For example, regional offices may be interested in more specific details provided by the source data 160, while a global office may be interested in a more general and high-level view of the data from the various regions. Consolidation and transformation 120 may extract data more relevant for examination by a global office.
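One possible form of time harmonization may be sketched as follows. The sketch assumes that a “time split” distributes a weekly value across calendar months in proportion to the number of days of the week falling in each month. This interpretation, the function name time_split, and the even per-day distribution are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Tuple


def time_split(week_start: date, value: float, days: int = 7) -> Dict[Tuple[int, int], float]:
    """Split one period value across (year, month) buckets, day by day.

    Assumes the value is spread evenly over the days of the period.
    """
    per_day = value / days
    by_month: Dict[Tuple[int, int], float] = defaultdict(float)
    for offset in range(days):
        day = week_start + timedelta(days=offset)
        by_month[(day.year, day.month)] += per_day
    return dict(by_month)
```

For a week that straddles a month boundary, the value is divided between the two months, so that weekly source data can be reported together with monthly or bi-monthly source data.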
The global cache 130 may receive data processed by the consolidation and transformation module 120. The global cache may be a memory storage where data, e.g., processed data from sources regarding different regions or product categories, may be received and stored. Such data may be in various formats, and/or may each correspond to different global categories and/or different countries/regions. In the example shown in
The global report processor 140 may process data according to the methods described herein. For example, the global report processor may define a publishing group and select a publishing date, which are respectively represented as a grouping module 142 and a date selector 144. The data processed by the global report processor 140 may be used to generate reports 150. The processor 140 is represented as a “global report processor” because the data generated may be of interest to global or “central” offices. However, “global report processor” may also be understood to include processing of data for regions encompassing sub-regions and processing of data for purposes other than reporting.
The grouping module 142 may select or describe data sets that can be grouped for global reporting. For example, data may be grouped by countries and/or categories. The categories and/or countries may be a “view” or a “filter” of a dataset. Various publishing groups for desired subsets of data may be defined by a user or may be defined automatically, for example as shown in
The date selector 144 may analyze possible publishing dates, taking into account factors provided by the consolidation and transformation module 120 and/or user input. The date selector 144 may automatically select a publishing date based on a weighing of various factors further discussed herein. The selection of a publishing date may be based on a grouping of categories and/or countries, as represented by the dashed arrow. For example, a grouping may indicate that the delivery schedules of the databases reflecting the countries and/or categories within the grouping are more heavily weighted for selecting the publishing date.
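As a non-limiting sketch, the heavier weighting of databases within a grouping may be expressed as a weighted sum of extrapolated days. The weights 3.0 and 1.0, the function names, and the choice of a simple weighted sum are illustrative assumptions.

```python
from typing import Dict, Set


def weighted_extrapolation(
    extrapolated_days: Dict[str, int],
    group_databases: Set[str],
    in_group_weight: float = 3.0,  # assumed weight for databases in the grouping
    other_weight: float = 1.0,     # assumed weight for all other databases
) -> float:
    """Weighted sum of extrapolated days; databases in the grouping count more."""
    return sum(
        days * (in_group_weight if db in group_databases else other_weight)
        for db, days in extrapolated_days.items()
    )


def select_publishing_date(
    candidates: Dict[str, Dict[str, int]],
    group_databases: Set[str],
) -> str:
    """Return the candidate date label with the lowest weighted extrapolation."""
    return min(
        candidates,
        key=lambda label: weighted_extrapolation(candidates[label], group_databases),
    )
```

With such a weighting, the same set of candidate dates may yield different selections for different groupings, because a gap in a database within the grouping is penalized more heavily than a gap elsewhere.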
The processor may also include a controller (not shown) that may render a GUI, respond to controls from a user of the DSiM system, send commands to the elements of the DSiM system, and output data. For example, the controller may query and/or receive user-defined settings for the system 100. The controller may customize and optimize the publishing of data based on internal calculations and/or user-indicated settings. The controller may send commands to a display device to display a GUI, and may allow the user to receive information from the system 100 and to provide user inputs. The controller may render a list of possible publishing times to select from and/or the amount of data extrapolation required for each publishing time.
In operation, the regional cache 110 may receive and/or store raw data from external source(s) 160. A user or system administrator may define planned data delivery dates for each raw input database 160. The dates may be agreed upon with a data provider. The agreement may also define time ranges corresponding to each delivery date. Alternatively, the system 100 may receive this information from the data provider automatically, and the information may be periodically updated. For example, the system 100 may receive regular forecasts from the data providers as to when the data deliveries are scheduled. This information may be a basis for generating possible publishing dates for the user for each publishing group. The consolidation and transformation module 120 may process the raw data, such as by reformatting, filtering, compacting, and extrapolating the data. This may reduce storage requirements. The result of the consolidation and transformation may be received and/or stored by global cache 130. The global report processor 140 may transform a subset of the data from the plurality of different data sources for reporting, according to the methods further discussed herein. For example, the global report processor may define publishing groups and select a publishing date.
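The generation of possible publishing dates from planned delivery dates may, for example, be sketched as follows. The sketch assumes that each planned delivery date of a database in a publishing group is itself a possible publishing date; the function name and the sample dates in the usage note are hypothetical.

```python
from datetime import date
from typing import Dict, List


def candidate_publishing_dates(
    planned_deliveries: Dict[str, List[date]],
    group_databases: List[str],
) -> List[date]:
    """Sorted, de-duplicated planned delivery dates of the group's databases."""
    return sorted(
        {
            delivery_date
            for db in group_databases
            for delivery_date in planned_deliveries.get(db, [])
        }
    )
```

For instance, given planned deliveries for databases DE, US, and UK, a publishing group containing only DE and US would be offered the delivery dates of those two databases, with dates shared by both listed once.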
In an alternative embodiment, global report processor 140 may directly receive source data 160 and process the data according to the methods discussed herein, without receiving the data via a regional cache 110 or the consolidation and transformation module 120. In another embodiment, the results of the global report processor 140 may be stored instead of or in addition to providing the data for reports 150.
Database DE may contain weekly data provided 12 times a year following a 5-4-4 pattern, i.e., the first delivery contains five new weeks, the next two deliveries each contain four new weeks, the subsequent delivery contains five new weeks, etc. Data may be reported on a weekly level because data is available at that time granularity. In
Database US may contain data for four weeks (“4-weekly”) provided 13 times a year. In
Database UK may contain data for two months (“bi-monthly”) provided six times a year. In
Each of the delivery dates corresponding to the databases DE, US, and UK are shown in the timeline at the bottom of
At block 322, the method 300 may determine a planned delivery date. A delivery date may be determined/defined for each database. This may also be referred to as determining a delivery schedule. The delivery date may be the same for each database or different between at least two of the databases. The planned delivery date may be a date agreed upon with a data provider. The planned delivery date may also be defined externally and received by the method 300. The method 300 may then determine what data will be included for a given planned delivery date (box 324). For example, the definition may be up to what date the delivery contains data. In the examples provided in
At block 346, the method 300 may select categories and regions to be published together. For example, the method may select global categories and countries to be published together. The method may select combinations of categories and regions to be part of a same publishing group. The selections may be made externally and received by method 300. The selections made in box 346 are further described herein with respect to
At block 362, the method 300 may simulate a publishing date. The simulation may include gathering data from various databases. Based on the gathered data, the method 300 may determine an amount of data to be extrapolated for a given publishing date (box 364). Data may be considered to need extrapolation if it is missing or of poor quality. The determination of the amount of data that needs to be extrapolated is further described herein with respect to
Optionally, the method 300 may receive input, for example via a GUI (box 368). The input may cause the method 300 to proceed to box 362 to perform additional simulation(s). The input may be changes to publishing dates and databases. For example, a user may input different publishing dates, e.g., moving the publishing date to a later point in time when more data may be available and less data would need to be extrapolated. Alternatively, the publishing date may be moved to an earlier point in time. The method 300 may then repeat boxes 362 to 366, and may determine whether the additional extrapolated days are tolerable. Thus, it is possible to find a balance between publishing a report at an early time and keeping extrapolated data at a tolerable level. Filtering possibilities may draw focus to more important databases, e.g., databases more relevant to a publishing group. Decisions may be stored to memory. In an embodiment, the process 300 may be performed at a predefinable time period. For example, the process 300 may be performed for each calendar month, e.g., each GUI may be for a calendar month. As another example, the process may be performed weekly.
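As one hedged illustration of the repeated simulation, the search for an early publishing date with a tolerable level of extrapolation may be sketched as follows. The sketch assumes that the total number of extrapolated days has already been determined for each simulated publishing date; the function name and the tolerance parameter are hypothetical.

```python
from datetime import date
from typing import Dict, Optional


def earliest_tolerable_date(
    simulated: Dict[date, int],
    max_extrapolated_days: int,
) -> Optional[date]:
    """Earliest simulated publishing date within the extrapolation tolerance.

    simulated maps each candidate publishing date to its total number of
    extrapolated days. Returns None if no candidate is within tolerance.
    """
    for candidate in sorted(simulated):
        if simulated[candidate] <= max_extrapolated_days:
            return candidate
    return None
```

Raising the tolerance moves the result to an earlier date, and lowering it moves the result to a later date, which mirrors a user moving the publishing date back and forth and re-running the simulation.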
For example, publishing group “North America” includes global groups CA, MX, and US, and categories GC1 and GC2. A GC category may represent a global category. For example, a global category may be chocolate, another global category may be milk, yet another global category may be body care, etc. The corresponding data is gathered from databases 402, 403, 404, and 405, each of which belongs to at least one of the global groups and at least one of the categories GC1 and GC2. Publishing group “GC 1 Top 3” includes global groups FR, UK, US, and category GC1. The corresponding data is gathered from databases 405, 406, and 408, which each includes at least one of the global groups and GC1. Publishing group “Europe” includes global groups DE, FR, NL, PL, and UK, and the category GC1. The corresponding data is gathered from databases 406, 412, 414, 416, and 418, each of which includes at least one of the global groups or GC1.
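By way of illustration only, a publishing group may be modeled as a filter over databases. The database names db_a, db_b, and db_c in the usage note and their coverage are hypothetical and do not correspond to the numbered databases in the figures; the matching rule (at least one country and at least one category in common) follows the “North America” example above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Database:
    name: str
    countries: FrozenSet[str]
    categories: FrozenSet[str]


@dataclass(frozen=True)
class PublishingGroup:
    name: str
    countries: FrozenSet[str]
    categories: FrozenSet[str]

    def matches(self, db: Database) -> bool:
        """True if db covers at least one group country and one group category."""
        return bool(self.countries & db.countries) and bool(self.categories & db.categories)


def databases_for_group(group: PublishingGroup, databases: List[Database]) -> List[str]:
    """Names of the databases whose data would be gathered for the group."""
    return [db.name for db in databases if group.matches(db)]
```

For example, a group with countries CA, MX, and US and categories GC1 and GC2 would gather data from a hypothetical database db_a covering US and GC1, but not from a database db_b covering DE and GC1 or a database db_c covering US and only a third category.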
For a given reporting period, not all of the data may be available. For a report covering data from the month of March, a potential reporting period 510 may be designated. In this example, database DE contains weekly data provided 12 times a year following a 5-4-4 pattern, i.e., the first delivery contains five new weeks, the next two deliveries each contain four new weeks, the subsequent delivery contains five new weeks, etc. Database US contains data for four weeks (“4-weekly”) provided 13 times a year. Database UK contains data for two months (“bi-monthly”) provided six times a year. The diamonds, circles, and triangles represent publishing dates of each of the databases. For example, a report may be published around week 14 for database DE (represented by a diamond). Another report may be published around week 19 (represented by a diamond).
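The three delivery patterns may be sketched as cumulative coverage, as follows. The sketch assumes a 52-week year, that the first delivery of each database starts at the beginning of the year, and that bi-monthly coverage is counted in months; the function names are hypothetical.

```python
from itertools import accumulate, cycle, islice
from typing import List


def units_per_delivery(pattern: List[int], deliveries: int) -> List[int]:
    """Repeat a pattern of new weeks (or months) per delivery."""
    return list(islice(cycle(pattern), deliveries))


def last_unit_covered(new_units: List[int]) -> List[int]:
    """Cumulative coverage: last week (or month) contained after each delivery."""
    return list(accumulate(new_units))


# Database DE: weekly data, 12 deliveries, 5-4-4 pattern (coverage in weeks).
de_coverage = last_unit_covered(units_per_delivery([5, 4, 4], 12))
# Database US: 4-weekly data, 13 deliveries (coverage in weeks).
us_coverage = last_unit_covered(units_per_delivery([4], 13))
# Database UK: bi-monthly data, 6 deliveries (coverage in months).
uk_coverage = last_unit_covered(units_per_delivery([2], 6))
```

Under these assumptions, both the 5-4-4 pattern over 12 deliveries and the 4-weekly pattern over 13 deliveries cover 52 weeks, but the boundaries of the individual deliveries rarely coincide, which is why data for a given month may be complete in one database and incomplete in another.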
Three sample planned publishing dates are shown in
In the example provided, to report March data based on fully reported data (i.e., without any extrapolated data), one would need to wait until point C, because at that point in time, data is made available by each database: database DE reports the data around week 14 (represented by a diamond labeled “4”), database US reports the data around week 15 (represented by a circle labeled “3”), and database UK reports that data around week 18 (represented by triangle labeled “2”). However, point C is in the middle of May, which may be too late for an intended audience of the report. In other words, reporting at point A corresponds to extrapolating data for all three databases, and reporting at point B corresponds to extrapolating data for one database, and reporting at point C corresponds to not extrapolating data for any of the databases because data for the desired period has been delivered by all databases.
The method may determine and analyze benefits and shortcomings of each situation. For publishing date A, March data may be delivered by database DE only, so that database US would be partly extrapolated and UK would be completely extrapolated to generate a report. An advantage is that the report would be generated earliest out of the three choices, A, B, and C. For publishing date B, March data may be delivered by databases DE and US, so that UK would be partly extrapolated to generate a report. For publishing date C, March is covered by all three databases, but it may be relatively late because the information for March would be published in May. The method may assign a score to each option based on the cost-benefit analysis and determine that option B is the most desirable because some of the data is available and the date is reasonably early.
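One possible scoring of options A, B, and C may be sketched as follows. The linear cost function, the weights, and the numbers of delay days and extrapolated days in the usage note are illustrative assumptions; they are not values taken from the figures.

```python
from typing import Dict, NamedTuple


class Option(NamedTuple):
    delay_days: int         # days between the end of the period and publication
    extrapolated_days: int  # total days of data that would be extrapolated


def cost(option: Option, delay_weight: float = 1.0, extrapolation_weight: float = 1.0) -> float:
    """Lower is better: weighted sum of lateness and extrapolation."""
    return delay_weight * option.delay_days + extrapolation_weight * option.extrapolated_days


def best_option(options: Dict[str, Option], **weights: float) -> str:
    """Label of the option with the lowest cost under the given weights."""
    return min(options, key=lambda label: cost(options[label], **weights))
```

With hypothetical values of A = Option(5, 40), B = Option(15, 15), and C = Option(45, 0) and equal weights, option B has the lowest cost. Reducing the extrapolation weight favors the earlier option A, and reducing the delay weight favors the complete option C.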
The user may manipulate and change the publishing date to a later point in time when more data is available and less data needs to be extrapolated, or navigate to an earlier point in time if more extrapolated days are tolerable. Thus, it is possible for a user to easily and quickly visualize the effects of variables related to a publishing date and find a balance between an amount of extrapolation and timeliness of publication.
The methods and systems discussed herein include many advantages. For example, the methods and systems allow for finding an optimal point in time to make market research data available for reporting, e.g., global reporting. Finding such an optimal point in time is useful for many marketing applications. The methods and systems may also assist a user to make and/or supervise a better decision compared with typical methods. For example, a user may define business-relevant publishing groups and run simulations of different publishing dates to more efficiently understand the impact of selecting various publishing dates. Such calculations would not be capable of being performed in one's mind because of the volume of data and simulations.
It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a computer processor executing software instructions, a mobile device, a smart device, a mobile application, or a computer readable medium such as a computer readable storage medium, or a computer network wherein program instructions are sent over optical or electronic communication or non-transitory links. It should be noted that the order of the steps of disclosed processes can be altered within the scope of the invention, as noted in the appended Claims and in the description herein.
The foregoing discussion has described operation of the embodiments of the present invention in the context of terminals that embody downloading systems. Commonly, these components are provided as electronic devices. They can be embodied in integrated circuits, such as application specific integrated circuits, field programmable gate arrays and/or digital signal processors. Alternatively, they can be embodied in computer programs that execute on personal computers, notebook computers, tablet computers, smartphones or computer servers. Such computer programs typically are stored in physical storage media such as electronic-, magnetic- and/or optically-based storage devices, where they are read to a processor under control of an operating system and executed. And, of course, these components may be provided as hybrid systems that distribute functionality across dedicated hardware components and programmed general-purpose processors, as desired.
Several embodiments of the invention are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations of the invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention.