This invention relates to in-season promotions and markdowns.
Retailers use in-season promotions and markdowns to manage seasonal and perishable stock, which needs to be cleared out by the end of a season or before its expiration date. However, markdowns and in-season promotions are generally determined using unscientific approaches that are not data-driven. This results in blanket promotions across an entire category of products, a set of stores, or indeed a region of stores. These approaches lead to discounts that may be too deep, e.g., a high discount relative to a more suitable discount. Discounts that are too deep erode a store's or indeed an entity's profitability and may lead to out-of-stock warnings for some products. At the same time, these approaches may lead to discounts that are too shallow for other products, leading to excess inventory at the end of the season and excess costs to clear the unsold inventory.
According to an aspect, a computer system includes a markdown engine that allows for optimizing of an allocation of markdowns across plural levers, the markdown engine including instructions that configure the computer system to receive input data that includes historical and markdown scope information about the products and stores, price information, as well as stock and stock-out information; prepare the received input data into data structures; cluster the data structures according to product metrics into one or more clusters; for each of the one or more clusters, determine a demand forecast according to a markdown plan for the one or more clusters; optimize the markdown campaign plan with respect to one or more optimization goals and constraints for the one or more clusters; and output an optimized recommended set of markdowns for the one or more clusters.
One or more of the following embodiments or other embodiments disclosed herein may be included with the above aspect.
Instructions to cluster further include instructions to determine quantifiable relationships in products and group products into the one or more clusters according to the quantifiable relationships.
The information to cluster products includes product class, product categories, and product metrics. The information is arranged in a vector defined by the product class, product categories, and product metrics. The instructions to cluster determine distances between vectors that represent a stock keeping unit.
The instructions further include instructions to cause the computer system to optimize proposed discounts sent from the demand forecast, by solving a mixed integer programming mathematical optimization that defines a target function to maximize as margin or margin penalized by leftover stock and defines a set of constraints that ensure that a solution found by the optimizer is applicable.
The instructions further include instructions to cause the computer system to re-optimize a current discount path for each stock keeping unit of a set of stock keeping units, based on received deviations for the current discount path and received data regarding new sales of the set of stock keeping units.
The instructions further include instructions to cause the computer system to monitor performance of the markdown plan versus an initial version of the markdown plan and re-optimize the markdown plan when the optimizer determines that there is a deviation in the monitored performance versus the initial version of the markdown plan.
The instructions further include instructions to determine a demand forecast and to select items for picklist selection.
Other aspects include computer program products and computer implemented methods.
One or more of the above aspects may provide a markdown and lifecycle management tool that enables merchandisers to maximize margins by determining the right time and discount for every product, including new products for which there is insufficient or no data. Given a set of input data, such as historical sales, prices, stock levels and costs, the markdown and lifecycle management tool provides an optimal discount path for each product, including new products, in order to maximize a pre-defined business objective (e.g., margin maximization) while satisfying business and operational constraints to ensure applicability of proposed discounts.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Referring now to
The markdown engine 12 also includes product clustering engine 18. The product clustering engine 18 receives the data structures 15 from the data preparation engine 16 and produces from the data structures 15 product clusters (See.
Referring now to
The markdown process 40 includes receiving 44 input data from various sources. The markdown process 40 prepares 46 the received input data by loading the input data into suitable data structures. Examples of data structures are set out below:
The first data structure is a date store SKU (Table 1) that contains information about the products (both historical and markdown scope). The date store SKU also contains, store and price information as well as stock, stock-out and tax information. In the US, the tax information may be applicable according to states whereas, in the European Union, the tax information is the VAT for that country. Other countries will be similar to one or both of the above.
Table 1, the date store SKU table includes the following entries.
Other tables include the following tables:
Where footfall is defined as a number of persons entering a store or shopping area in a given time period, e.g., persons per hour, day, week, month, etc.
The markdown process 40 also includes product clustering 48 that outputs a markdown strategy for new products which have no sales history, as new transactions and inventory information becomes available. Product clustering 48 involves finding quantifiable relationships between products by understanding physical similarities and historical patterns.
The markdown process 40 also includes picklist selection 49 to determine which products to include. For example, the data processing system 10 analyzes and parses data items representing products to generate the list selection. These data items specify attributes of the products, such as lists of the potential attributes. To identify products to include in the list selection, the data processing system reads, from a hardware storage system, data structures representing rules specifying conditions (defined in terms of attributes) to be satisfied for inclusion and/or a discount. Products are selected for discount by descriptive features, such as season, category and other attributes, and by performance, such as selling speed, coverage weeks (stock/average sales), etc. For example, certain products will generally not be discounted whereas other products may be included in separate promotions. The markdown process 40 also includes demand forecast process 50 that strengthens relationships between clusters, detects promotion fatigue and accounts for seasonal changes and weather implications. The picklist selection 49 determines the articles to be included in the markdown process. The picklist selection 49 is an input to a demand forecast, which involves determining a sales volume forecast over time per discount.
The markdown process 40 also includes optimize process 54 to provide an optimal solution. Optimize process 54 takes into consideration a variety of parameters. Optimize process 54 provides a forecast per article, per store, at each discount level. Optimize process 54 uses a fully automated output that operates at a high level of granularity. With optimize process 54 complex problems can be solved over many permutations.
The markdown process 40 also includes a react process 58 that continuously learns and improves the markdown model. The markdown process 40 outputs 60 a list of prices and suggested discount paths, forecasted sales, profit and volume. React process 58 is used to significantly improve accuracy, and allows a human input to calculate possible effects of possible constraints or allows a user to overrule possible constraints.
Product Clustering
Product clustering finds and combines product similarities to predict demand for new and unseen products. Clustering determines quantifiable relationships in products, and groups products into clusters according to the strength of these relationships. Certain variables may be excluded, depending on user requirements. Clustering involves an advanced AI system discussed below that compares different elements from multiple products and takes a percentage-based similarity calculation to determine an overall cluster. Some of the relationships are based on product hierarchy, while others are based on physical properties and other factors.
Example information used to cluster products includes product class, product categories, and product metrics, e.g., for the product class of footwear, material (e.g., outer material/filling material), cover sole, outer sole, shape, heel height, color, size, etc. These product metrics are used to determine distances between points P, discussed below.
Referring now to
As a generalized example of clustering, a point P is a vector defined by data for the product class. A point Pi is a point in N-dimensional space that is defined by data including product class, category, outer material, filling, cover sole, outer sole, shape, heel height, color, size, style, etc. For this example, the point Pi is a point that belongs to product class of “footwear.” A point Pk is an N-dimensional vector that is defined by data including product class, category, outer material, filling, color, size, style, etc. The point Pk belongs to product class of ‘coats.’
For the particular point Pi in N-dimensional space, product clustering 48 determines 48a whether that point Pi is close to another point Pi+1 of the same product class, by determining the distance between those points as X=Pi+1−Pi in the N-dimensional space and compares 48b that distance X to a threshold value Tfootwear. For the particular point Pk in N-dimensional space, product clustering 48 determines whether that point Pk is close to another point Pk+1 of the same product class, i.e., is not part of Pi+1, by determining the distance between those points as Y=Pk+1−Pk in the N-dimensional space and compares that distance Y to a threshold value Tcoat.
Determining the distance between those points uses description of products, e.g., web attributes to link to categories, price, materials, etc., as metrics for determining the distance and hence strength of a relationship between two points.
The product clustering 48 determines the distance X between all of the points (here in two-dimensional space for illustration, but in practice in n-dimensional space) and groups the points into the clusters, provided that the distance X is less than or equal to the threshold value, e.g., Tfootwear or Tcoat, and that the product class of each point is the same. Each dimension in n-dimensional space corresponds to a dimension of the vector P, and each of the n dimensions corresponds to a metric, such as web attributes that link to categories, price, materials, etc.
As an example, the product clustering 48 determines 48a the distance X between a point Pi+1 and any point in each existing cluster 48c, compares 48b that distance X to the threshold T and determines whether the point Pi+1 belongs in the existing cluster 48c or whether the point Pi+1 belongs in a new cluster 48d.
The product clustering 48 determines 48e whether there are more points. If so, the product clustering 48 retrieves 48f the next point and continues processing, as shown. On the other hand, if there are not any more points to cluster, the process may find 48g a centroid for each determined cluster. Finding a centroid involves finding a point that best represents the cluster, e.g., is at the center of the cluster or which is clustered around the predominant number of points in the cluster, using the K-nearest neighbors algorithm, as mentioned above.
Thus, the product clustering 48 groups points into clusters and from the cluster determines a centroid that is used to represent the points and all possible points in the cluster. Each cluster has associated with the cluster an identification of the product class. The centroid “D,” is the point P in N-dimensional space which, along with a determined tolerance, variance, or standard deviation, represents that particular cluster. These data are used to classify new products into clusters according to existing products and then these new products can be assigned markdowns. The product clustering 48 has non-missing, positive values for all three components of the demand forecasting to predict product demand and, thus, to assign markdowns. The product clustering 48 can check if there are more product classes 48i. If there are more product classes, the product clustering 48 returns to 48a and determines the distance for the next class. Each determined cluster is assigned a current discount.
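The threshold-based grouping and centroid determination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of product clustering 48: product metrics are assumed to be already encoded as numeric vectors, distance is assumed to be Euclidean, the centroid is taken as the arithmetic mean of the members (a simplification swapped in for the K-nearest neighbors step mentioned above), and the names `cluster_points`, `centroid`, `cluster_by_class` and `min_members` are hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between two equal-length numeric vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_points(points, threshold):
    """Put each point in the first cluster that has a member within `threshold`,
    otherwise start a new cluster."""
    clusters = []
    for p in points:
        for members in clusters:
            if any(distance(p, m) <= threshold for m in members):
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters

def centroid(members):
    """Arithmetic mean of the members (assumption: stands in for the centroid step)."""
    n = len(members)
    return tuple(sum(m[d] for m in members) / n for d in range(len(members[0])))

def cluster_by_class(products, thresholds, min_members=2):
    """products: list of (product_class, vector) pairs.
    Points are only compared with points of the same product class.
    Returns ({class: [centroid, ...]}, [(class, vector), ...] outliers)."""
    by_class = {}
    for product_class, vector in products:
        by_class.setdefault(product_class, []).append(vector)
    centroids, outliers = {}, []
    for product_class, points in by_class.items():
        for members in cluster_points(points, thresholds[product_class]):
            if len(members) >= min_members:
                centroids.setdefault(product_class, []).append(centroid(members))
            else:
                # Too few members: treated as noise, to be ignored or grouped manually.
                outliers.extend((product_class, m) for m in members)
    return centroids, outliers
```

In this sketch a point of class "coats" that lies near a "footwear" cluster is never merged into it, because thresholds are applied per product class.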
Referring now to
While the point 104 is close to the cluster of the class “footwear,” it actually belongs to the class of “coats” but is not included in either cluster “footwear” or “coats,” since the distance between the point 104 and the nearest point in any of the clusters of “coats” is beyond the threshold “Tcoat” for “coats.” Similarly, the point 102 is beyond the threshold “Tfootwear” for “footwear” and is not included in any cluster. Both point 102 and point 104 are considered outliers or, more correctly, noise in the data, and may be ignored or grouped manually. There can be another requirement for forming clusters, which is that the cluster has a minimum number of members. Generally, that number is determined empirically.
In addition, after processing of all points in the class, there may be some points that do not fit into any class. These outliers can be manually discounted.
Variations in the grouping are possible. For instance, the process has been described as involving determinations of clusters for each class of objects, sequentially. Thus, as described, a first class of objects are processed, clustered and represented as a centroid and a second class of objects are processed, clustered and represented as a centroid, and so forth. This need not be the case and instead objects from different classes can be processed and clustered, and the clusters can be represented as centroids that are identified by the class.
Picklist Selection
Picklist selection 49 determines the articles to be included in the markdown. Picklist selection 49 forecasts inventory at the end of an end of season sale (EOSS) for all seasonal and non-seasonal articles if no intervention is made (under the assumption that the articles remain in the current discount cluster). SKUs with high inventory and high expected responsiveness to discounts are prioritized. The SKUs are finalized together with the business and compared with control stores. SKU is short for “stock keeping unit.” A SKU is generally an alphanumeric code that retailers assign to products to keep track of stock levels. If a product has different colors and sizes, each variation is assigned a unique SKU. The granularity is defined by each client system.
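As an illustration of the prioritization described above, the following sketch projects leftover inventory under a no-intervention assumption and keeps SKUs that would have leftover stock and are expected to respond to discounts. The field names (`stock`, `avg_weekly_sales`, `elasticity`), the `min_elasticity` cut-off and the ranking by projected leftover are hypothetical choices made for this example, not rules taken from the picklist selection 49.

```python
def coverage_weeks(stock, avg_weekly_sales):
    """Coverage weeks = stock / average sales; infinite if the item is not selling."""
    if avg_weekly_sales <= 0:
        return float("inf")
    return stock / avg_weekly_sales

def select_picklist(skus, weeks_left, min_elasticity):
    """skus: list of dicts with keys 'sku', 'stock', 'avg_weekly_sales', 'elasticity'.
    Returns SKU ids ordered by projected end-of-season leftover, largest first."""
    picked = []
    for s in skus:
        # No-intervention projection: sales continue at the current average rate.
        leftover = max(0.0, s["stock"] - s["avg_weekly_sales"] * weeks_left)
        if leftover > 0 and s["elasticity"] >= min_elasticity:
            picked.append((s["sku"], leftover))
    picked.sort(key=lambda item: item[1], reverse=True)
    return [sku for sku, _ in picked]
```

A SKU that would sell out on its own before the end of the season, or one with little expected response to a discount, is left off the list in this sketch.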
Demand Forecast
Referring now to
Baseline
The baseline is the scale-setting part of the predictive model. Baseline derives from the idea that the single best predictor of an item's sales tomorrow is how many that item sold today. The intuitive interpretation of the baseline is how many units you would expect to sell of that item in that store on a typical day of sales at the typical price. In broad terms, the baseline is an estimate of the average daily sales of an item over some appropriate range of the recent past. Almost all the rest of the module handles edge cases, such as the following:
Baseline corrects for items that have not recently sold at the typical price. Removes special dates (like bank holidays or special promotions dates) based on user input in the special dates data. Removes stock-out dates based on user input in the stock-out table. The markdown process 40 is to be used with products that have been sold during the full season.
If a client customer adds new or relatively new products to a markdown process, baseline computation is likely to fail without a mechanism to handle new products. That is, baseline needs to run, at least, every time the execution date is changed. However, see product clustering 48, discussed above, that is used in the markdown process 40 to output a markdown strategy even for new products with no sales history.
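The baseline idea described above (average daily sales over a recent window, with special dates and stock-out dates removed) can be sketched as follows. The function name `baseline`, the `lookback_days` parameter and the choice to return `None` when no usable days remain are assumptions made for illustration; the handling of items that have not recently sold at the typical price is not covered here.

```python
from datetime import date, timedelta

def baseline(daily_sales, execution_date, lookback_days,
             special_dates=(), stockout_dates=()):
    """daily_sales: {date: units sold}. Returns the average daily units over the
    lookback window before `execution_date`, skipping special and stock-out dates.
    Returns None when no usable day remains (e.g., a new product with no history)."""
    excluded = set(special_dates) | set(stockout_dates)
    start = execution_date - timedelta(days=lookback_days)
    usable = [
        units
        for day, units in daily_sales.items()
        if start <= day < execution_date and day not in excluded
    ]
    if not usable:
        return None
    return sum(usable) / len(usable)
```

A `None` result in this sketch corresponds to the case in which a product has no usable sales history and a cluster-based strategy would be needed instead.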
Uplift
Uplift describes the price elasticity of an item: how much its demand changes as its price is raised or lowered. Uplift is assumed to have a multiplicative effect. For example, a certain change in price may raise demand for a product by 20% over its baseline value rather than increasing it by, say, 5 units per day. As a result, uplift is a dimensionless number between 0 and infinity, with a value of 1 indicating no particular uplift over the baseline demand. The uplift is essentially defined to be 1 at the typical discount at which the baseline is calibrated. Uplift is computed by assessing the growth in units sold as a function of the price change. This historical record is fitted to a mathematical curve (typically exponential). Uplift is computed using previous markdown seasons. The uplift module need only be executed once for each campaign.
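As a sketch of fitting the historical record to an exponential curve, the example below assumes the form uplift(d) = exp(k·d), where d is the discount fraction measured relative to the typical discount, so that uplift is 1 at d = 0. The rate k is estimated by least squares on the logarithm of the observed sales ratios. The functional form, the fitting method and the names `fit_uplift_rate` and `uplift` are assumptions for illustration only.

```python
import math

def fit_uplift_rate(observations):
    """observations: list of (discount_fraction, units_sold / baseline_units).
    Fits ln(ratio) = k * discount through the origin and returns k."""
    usable = [(d, r) for d, r in observations if r > 0]
    den = sum(d * d for d, _ in usable)
    if den == 0:
        raise ValueError("need at least one observation with a non-zero discount")
    num = sum(d * math.log(r) for d, r in usable)
    return num / den

def uplift(discount, k):
    """Multiplicative uplift at a given discount; equals 1 when discount is 0."""
    return math.exp(k * discount)
```

Fitting through the origin is what enforces the property that uplift equals 1 at the discount at which the baseline is calibrated.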
Boost
Boost is the term for all of the non-price drivers of sales: typically, all of the periodic or event-driven effects that may hinder or help the sales in a store. Examples of these include weekly sales trends that may show that weekend traffic (and thus sales) is 20% higher than what is seen during weekdays, and annual patterns that may show an overall drop in sales volumes during August but more sales in December. Other examples include holidays that might show elevated sales in the days leading into Christmas, low sales on Christmas itself, and extremely high sales on the two days after Christmas. Other examples include events like voucher promos or center-wide events that also drive sales. These sales can be modeled using an autoregressive integrated moving average with exogenous variables (ARIMAX) framework (see https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average), where the “I” term allows for the possibility of long-term, organic growth of the business, the “X” terms allow external drivers (like holidays and other special events) that may occur at any time of the year to be specified, and the “AR” and “MA” terms both account for the weekly/seasonal patterns as well as the multi-day effects that certain holidays and events may cause.
Regression factors can be added by providing a list in features. The user needs to prepare data in the training and prediction data at the aggregation level for boost. It is essential that the regression factor is available in the prediction time frame, so a regression factor for which future values are not available or cannot be estimated should not be used as a regression factor.
Demand forecast process 50 has the data processing system 10 determine 50a all discount options for all products at any desired level of aggregation. By “all discount options” is meant that, rather than explicitly calculating all discount paths, the demand forecast process 50 calculates all possible discount steps (e.g., 10%, 20%, . . . ) at each update period. An update period is a discretization of time into the moments at which prices are allowed to change. Thus, the number of options calculated corresponds to the number of discount steps times the number of update periods.
By contrast, a process that calculated full discount paths would need to evaluate a number of options equal to the number of discount steps raised to the power of the number of update periods, which is significantly larger. Demand forecast process 50 potentially allows a different discount path for each SKU-Store combination. Demand forecast process 50 has the data processing system 10 use 50b a three-component multiplicative approach. Demand forecast process 50 has the data processing system 10 determine 50c a baseline of average sales on a normal trading day. Demand forecast process 50 has the data processing system 10 determine 50d the prediction of the boost in extra sales due to non-discount factors, such as relevant calendar dates, weekends, or other factors such as marketing campaigns. Demand forecast process 50 has the data processing system 10 determine 50e the prediction of the uplift in sales due to discount factors. Baseline defines the level of the demand and serves as a scaling factor to adapt the forecast to the recent past. Boost models non-price drivers, tracking seasonality of products and defining the overall shape of sales in time. Uplift models the contribution of price drivers to demand, allowing incremental sales to be simulated at different discount points.
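The three-component multiplicative approach and the difference between per-step and per-path option counts can be illustrated with a short sketch. The exponential uplift form exp(k·d) and the names `forecast_units`, `forecast_grid` and `option_counts` are assumptions introduced for this example.

```python
import math

def forecast_units(baseline, boost, uplift):
    """Three-component multiplicative forecast: baseline x boost x uplift."""
    return baseline * boost * uplift

def forecast_grid(baseline, boost_by_period, discount_steps, uplift_rate):
    """One forecast per (update period, discount step), rather than per full path.
    Uplift is assumed to be exp(uplift_rate * discount)."""
    grid = {}
    for period, boost in enumerate(boost_by_period):
        for d in discount_steps:
            grid[(period, d)] = forecast_units(
                baseline, boost, math.exp(uplift_rate * d)
            )
    return grid

def option_counts(n_steps, n_periods):
    """Returns (options evaluated per step-and-period, number of full paths)."""
    return n_steps * n_periods, n_steps ** n_periods
```

With 5 discount steps and 4 update periods, for instance, the grid holds 20 forecasts, whereas enumerating full discount paths would require 625.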
Scenario Generation
Scenario generation is optional and involves assessing, within set business boundaries, all permutations of discounts and pricing paths. Revenue or cash margin uplift is assessed within the business boundaries based on business goals, and all possible store-item group price paths are generated. The markdown process 40 does not need an explicit calculation of all scenarios (i.e., price paths). The optimization process 54 discussed below can find an optimal price path without the need to explicitly calculate all alternatives. However, a scenario generation module may be used to show client customers why other alternative scenarios were inferior to the optimal path chosen.
Optimization
Referring now to
The optimize process 54 receives 54a proposed discounts sent from the demand forecast process 50. The optimize process 54 optimizes the proposed discounts by solving 54b a mixed integer programming mathematical optimization. The objective of the optimization is to maximize benefit during the markdown campaign. This means that the optimal discounts can be written as:
Where s is the selected store, i is the selected item and p is the selected discount period. K represents the business parameter to maximize. Given that this problem is discrete (the price selected for each of the products is not a continuous variable), a MIP solver is used to solve Eq. 1. This problem can be proved to be non-linear. Given this property, the formulation needs to be modified for the MIP solver.
Where B is a binary variable and d are all possible discounts. With this formulation, the MIP solver generates all possible combinations and then restricts the binary variable to be 1 in the selected scenario and 0 in those that are not optimal. This way the formulation becomes linear. The initial problem size is the product of the number of stores, number of items, number of periods and number of possible discounts.
For additional information on mixed integer programming please see “https://www.gurobi.com/resource/mip-basics/.” The mixed integer programming mathematical optimization defines 54c a target function to maximize (typically margin or margin penalized by leftover stock, but this can be tailored to clients' needs) and defines a set of constraints 54d that ensure that the solution found by the optimizer is applicable. Some constraints are purely instrumental to define the markdown problem (such as enforcing non-increasing prices over time), while others allow business users to input their strategy to satisfy some business or operational requirements (such as imposing a limit on the number of re-tags in a given week).
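For orientation only, one way the two formulations described in the text could be written is sketched below. This is an assumed notation built from the symbols defined above (s, i, p, d, K and B) and is not a reproduction of Eq. 1 or of the modified formulation; in particular, the one-discount-per-store-item-period constraint is an assumption consistent with the binary variable being 1 only in the selected scenario.

```latex
% Assumed discrete (non-linear) form: choose one discount per store s, item i, period p
d^{*} \;=\; \arg\max_{d} \; \sum_{s}\sum_{i}\sum_{p} K\bigl(s, i, p, d_{s,i,p}\bigr)

% Assumed linearized form with binary selection variables B
\max_{B} \; \sum_{s}\sum_{i}\sum_{p}\sum_{d} B_{s,i,p,d}\, K_{s,i,p,d}
\quad \text{subject to} \quad
\sum_{d} B_{s,i,p,d} = 1 \;\; \forall\, s, i, p,
\qquad B_{s,i,p,d} \in \{0, 1\}
```

Under this notation the number of binary variables equals the number of stores times the number of items, periods and possible discounts, which matches the problem size stated above.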
Once promotions go live in the stores, the optimizer monitors 54e performance of the optimized markdown plan. When the optimizer determines that there is a deviation in the monitored performance vs the un-optimized version of the markdown campaign plan the deviation is fed to the react process 58, which attributes the deviation to each of the components of the demand forecast, resulting in updated baseline, uplift, and boost values for each SKU (or any level of aggregation defined).
Optimization Constraints
The following constraints are commonly used during optimization. Selected price should be above cost: select price paths where all discounted prices are above cost. The result will be infeasible if there are no available discounts that keep the price above cost. Stores should share the same price: all stores belonging to the same store group should have the same prices for the same items. Price monotony: prices can only go down (or stay the same) during the markdown campaign, per product. Minimum and maximum sell through: select a percentage of the purchased stock that should remain (or should be sold) during the markdown campaign. Minimum and maximum average discount: compute the average of all discounts and force it to be above or below a certain value. Minimum discount increase: configure a minimum step increase of discounts, by product, for each of the update periods. Maximum number of retags: configure a maximum number of item retags applied to all stores for each of the update periods.
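To make the objective and two of the constraints concrete, the sketch below searches discount paths for a single SKU-store pair by exhaustive enumeration. Enumeration is swapped in here purely for readability; it is not the mixed integer programming approach described above and would not scale to realistic problem sizes. The exponential uplift form, the function name `best_discount_path` and all parameters are hypothetical.

```python
import math
from itertools import product

def best_discount_path(price, cost, stock, baseline, boosts, discounts,
                       uplift_rate, leftover_penalty=0.0):
    """Returns (best path, objective value), or (None, -inf) if nothing is feasible.
    Objective: margin minus a penalty per unit of leftover stock."""
    best_path, best_value = None, -math.inf
    for path in product(discounts, repeat=len(boosts)):
        # Price monotony: discounts may stay the same or deepen, never shrink.
        if any(later < earlier for earlier, later in zip(path, path[1:])):
            continue
        # Selected price should be above cost in every period.
        if any(price * (1 - d) <= cost for d in path):
            continue
        remaining, margin = stock, 0.0
        for d, boost in zip(path, boosts):
            demand = baseline * boost * math.exp(uplift_rate * d)  # assumed uplift form
            sold = min(remaining, demand)
            margin += sold * (price * (1 - d) - cost)
            remaining -= sold
        value = margin - leftover_penalty * remaining
        if value > best_value:
            best_path, best_value = path, value
    return best_path, best_value
```

With no leftover penalty and ample stock, the sketch keeps full price; with a penalty on leftover stock, it moves to a deeper discount in a later period to clear inventory, which mirrors the trade-off between margin and leftover stock named above.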
React
Referring now to
The react engine 13 continuously learns and improves the discounts. A react process 58 (
Referring now to
The react process 58 has the data processing system 10 analyze 58e the current sales of each SKU of the set of SKU's against the current discount path for each SKU of the set of SKU's and determines 58e whether each SKU is satisfying the received goals 55a and constraints 55b. The react process 58 determines 58f whether each SKU of the set of SKU's satisfies the received goals 55a and constraints 55b. When the react process 58 has the data processing system 10 determine that the received goals 55a and constraints are satisfied, the react process 58 continues to periodically receive the data regarding new sales of the set of SKU's, updated inventory levels for each SKU of the set of SKU's, and the current discount path for the set of SKU's. When the react process 58 has the data processing system 10 determine that the received goals 55a and constraints are not satisfied, the react process 58 re-optimizes 58i by re-executing the optimize process 54 for the discount path for the set of SKU's by determining a new, updated discount path for the set of SKU's.
Thus, once a markdown campaign starts, the react process 58 observes real sales for those SKUs and uses a calculated deviation to improve future decisions (markdowns). The react process 58 calculates a controlled deviation ratio based on recent data and calculates the deviation contribution to baseline and uplift. The react process 58 uses the contributions to scale and shape the next recommendations.
Allocation of the contributions is done by splitting between the non-price and price drivers of the demand forecasting model. At a given discount level, the forecast can be split into two levers: non-price drivers, which cover the baseline plus boost components of the markdown process 40 (sometimes summarized as “baseline”), reflecting what would have sold without any discount at the given date; and price drivers, which cover uplift in the markdown process 40, reflecting how much of the sales were expected to come from the selected discount.
Baseline share=% of the sales coming from the non-price drivers at observed % discount.
Uplift share=% of the sales coming from the price drivers at observed % discount
Baseline contribution=Bias×Baseline share
Uplift contribution=Bias×Uplift share
Bias is an observed bias based on a lookback period. Two parameters can control the speed of reaction to the observed bias.
Lookback period, which is the period that is chosen to measure the bias between locked plan in the system and actual sales figures.
Confidence value in lookback period, which is a percentage of the observed bias to be corrected using react process 58.
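The allocation of an observed bias between baseline and uplift can be sketched as follows. Several details are assumptions made for this example rather than statements of how the react process 58 works: bias is taken as the relative deviation of actual from planned sales over the lookback period, scaled by the confidence value; and, because the forecast is multiplicative, the baseline share at a given discount is taken as 1/uplift (the fraction that would have sold without the discount), with the remainder attributed to uplift. The name `react_contributions` is hypothetical.

```python
def react_contributions(planned_units, actual_units, uplift, confidence=1.0):
    """planned_units / actual_units: totals over the lookback period.
    uplift: forecast uplift multiplier (>= 1) at the observed discount.
    confidence: fraction of the observed bias to correct (0..1)."""
    if planned_units <= 0:
        raise ValueError("planned_units must be positive")
    if uplift < 1:
        raise ValueError("this sketch assumes uplift >= 1")
    bias = confidence * (actual_units - planned_units) / planned_units
    baseline_share = 1.0 / uplift        # assumed share from non-price drivers
    uplift_share = 1.0 - baseline_share  # assumed share from price drivers
    return {
        "bias": bias,
        "baseline_contribution": bias * baseline_share,
        "uplift_contribution": bias * uplift_share,
    }
```

In this sketch the two contributions always sum to the corrected bias, so a lower confidence value slows the reaction of both components proportionally.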
Output
The output includes recommended discounts, in a format that can be plugged directly into a client's pricing system, and business parameters such as unit sales, revenue, and margin projections. A business can therefore compare several runs of the tool and select the strategy most appropriate to the business' needs.
The markdown process 40 also includes promotion fatigue processing 41 that accounts for a decline in demand when products are on promotion for extended periods of time. The described approach predicts full discount path for all SKU's at any desired level of aggregation (potentially, allowing a different discount path for each SKU—Store combo). This allows a customer to plan a markdown strategy in advance, for all retags, instead of focusing on a single markdown/promotion event.
Referring now to
The distributed computing environment 150 includes data centers that include cloud computing platform 152, rack 154, and node 156 (e.g., computing devices, processing units, or blades) in rack 154. The technical solution environment can be implemented with cloud computing platform 152 that runs cloud services across different data centers and geographic regions. Cloud computing platform 152 can implement fabric controller 158 component for provisioning and managing resource allocation, deployment, upgrade, and management of cloud services. Typically, a cloud computing platform 152 acts to store data or data analytics applications in a distributed manner. Cloud computing platform 152 in a data center can be configured to host and support operation of endpoints of a particular service application. Cloud computing platform 152 may be a public cloud, a private cloud, or a dedicated cloud.
Node 156 can be provisioned with host 160 (e.g., operating system or runtime environment) executing a defined software stack on node 156. Node 156 can also be configured to perform specialized functionality (e.g., compute nodes or storage nodes) within cloud computing platform 152. Node 156 is allocated to run one or more portions of a service application of a tenant. A tenant can refer to a customer utilizing resources of cloud computing platform 152. Service application components of cloud computing platform 152 that support a particular tenant can be referred to as a tenant infrastructure or tenancy. The terms service application, application, or service are used interchangeably herein and broadly refer to any software, or portions of software, that run on top of, or access storage and compute device locations within, a datacenter.
When more than one separate service application is being supported by nodes 156, nodes 156 may be partitioned into virtual machines (e.g., virtual machine 162 and virtual machine 164). Physical machines can also concurrently run separate service applications. The virtual machines or physical machines can be configured as individualized computing environments that are supported by resources 166 (e.g., hardware resources and software resources) in cloud computing platform 152. It is contemplated that resources can be configured for specific service applications. Further, each service application may be divided into functional portions such that each functional portion is able to run on a separate virtual machine. In cloud computing platform 152, multiple servers may be used to run data analytics applications and perform data storage operations in a cluster. In particular, the servers may perform data operations independently but exposed as a single device referred to as a cluster. Each server in the cluster can be implemented as a node.
Client device 170 may be linked to a service application in cloud computing platform 152. Client device 170 may be any type of computing device, which may correspond to computing device 180 described with reference to
Referring to
The features of the markdown improve the functioning of the distributed computing environment and/or the computing device (or computer or data processing system, etc.) by providing the markdown and lifecycle management tool that enables the computing device to maximize margins for every merchant product including new products.
Embodiments can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Embodiments can be implemented in a computer program product tangibly stored in a machine-readable (e.g., non-transitory computer readable) hardware storage device for execution by a programmable processor; and method actions can be performed by a programmable processor executing a program of executable computer code (executable computer instructions) to perform functions of the invention by operating on input data and generating output. Embodiments can be implemented advantageously in one or more computer programs executable on a programmable system, such as a data processing system that includes at least one programmable processor coupled to receive data and executable computer code from, and to transmit data and executable computer code to, memory, and a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language.
Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive executable computer code (executable computer instructions) and data from memory, e.g., a read-only memory and/or a random-access memory and/or other hardware storage devices. Generally, a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Hardware storage devices suitable for tangibly storing computer program executable computer code and data include all forms of volatile memory, e.g., semiconductor random access memory (RAM), all forms of non-volatile memory including, by way of example, semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
A number of embodiments of the invention have been described. The embodiments can be put to various uses, such as educational, job performance enhancement, e.g., sales force and so forth. Nevertheless, it will be understood that various modifications may be made without departing from the scope of the invention.