Artificial Intelligence Based Room Personalized Demand Model

Information

  • Patent Application
  • 20210117998
  • Publication Number
    20210117998
  • Date Filed
    February 07, 2020
  • Date Published
    April 22, 2021
Abstract
Embodiments model demand and pricing for hotel rooms. Embodiments receive historical data regarding a plurality of previous guests, the historical data including a plurality of attributes including guest attributes, travel attributes and external factors attributes. Embodiments generate a plurality of distinct clusters based on the plurality of attributes using machine learning soft clustering and segment each of the previous guests into one or more of the distinct clusters. Embodiments build a model for each of the distinct clusters, the model predicting a probability of a guest selecting a certain room category and including a plurality of variables corresponding to the attributes. Embodiments eliminate insignificant variables of the models and estimate model parameters of the models, the model parameters including coefficients corresponding to the variables. Embodiments determine optimal pricing of the hotel rooms using the model parameters and a personalized pricing algorithm.
Description
FIELD

One embodiment is directed generally to a computer system, and in particular to a computer system that generates an artificial intelligence based room personalized demand model.


BACKGROUND INFORMATION

Increased competition in the hotel industry has caused hoteliers to look for more innovative revenue management policies, such as personalized pricing and recommendations. Over the past few years, hoteliers have come to understand that not all guests are equal and a traditional one-size-fits-all policy might prove to be ineffective. Therefore, a need exists for hotels to profile their guests and offer them the right product/service at the right price with the goal of maximizing their profit.


SUMMARY

Embodiments model demand and pricing for hotel rooms. Embodiments receive historical data regarding a plurality of previous guests, the historical data including a plurality of attributes including guest attributes, travel attributes and external factors attributes. Embodiments generate a plurality of distinct clusters based on the plurality of attributes using machine learning soft clustering and segment each of the previous guests into one or more of the distinct clusters. Embodiments build a model for each of the distinct clusters, the model predicting a probability of a guest selecting a certain room category and including a plurality of variables corresponding to the attributes. Embodiments eliminate insignificant variables of the models and estimate model parameters of the models, the model parameters including coefficients corresponding to the variables. Embodiments determine optimal pricing of the hotel rooms using the model parameters and a personalized pricing algorithm.





BRIEF DESCRIPTION OF THE DRAWINGS

Further embodiments, details, advantages, and modifications will become apparent from the following detailed description of the embodiments, which is to be taken in conjunction with the accompanying drawings.



FIG. 1 is a block diagram of a computer server/system in accordance with an embodiment of the present invention.



FIG. 2 is a flow diagram that illustrates the functionality of the room demand model module of FIG. 1 in accordance with embodiments.



FIG. 3 illustrates an example of clustering based on attributes in accordance with one embodiment.



FIG. 4 illustrates a further example of clustering using random forest machine learning for correlation in accordance with one embodiment.



FIG. 5 illustrates an example of applying a mixture MNL model to each cluster to estimate the demand in accordance with one embodiment.



FIG. 6 illustrates the complete personalized demand model in accordance with one embodiment.



FIG. 7 illustrates an example of a likelihood function in accordance with one embodiment.



FIG. 8 illustrates a variable selection algorithm in accordance with an embodiment.



FIGS. 9 and 10 illustrate the results of embodiments of the invention using an experimental dataset to predict the probability of a particular guest selecting a certain room category.



FIG. 11 illustrates the results of an experimental study in accordance with embodiments of the invention.





DETAILED DESCRIPTION

Embodiments utilize artificial intelligence (“AI”) to predict demand for multiple hotel room categories based on the individual attributes of the hotel guests, their booking channels, and room category features, including the offered price. Embodiments further estimate the fraction of “no-purchase guests”, i.e., the number of guests who decide not to book a hotel room, which is an unobservable variable. Embodiments output the probability that each individual guest will book a room in a specific room category. Embodiments further estimate the relative monetary value of the room features for each cluster of hotel guests. Examples of room features include the type of bed (e.g., king vs. queen), the view (e.g., ocean or garden), the size of the room, or the type of room (e.g., suite vs. single room). To generate a personalized demand model based on guest characteristics as well as room features, embodiments use a combination of clustering and mixture multinomial choice modeling.


Traditional revenue management (“RM”) practices in the hotel industry use capacity control mechanisms, specifically controlling room availabilities for different categories of products, typically using length-of-stay controls. In general, the hotel industry does not use advanced demand models based on the individual attributes of the hotel guests, their booking channels and room category features. However, operating conditions have significantly changed for the hotel industry in recent years. Given the transparency of room prices via the Internet, corporate travel management companies, leisure travel agencies, and brand websites moved to a common distribution platform and started reaching into each other's customer bases. Search engines then drove this transparency even further, aggregating the online rates from all distribution channels into a single interface and showing price as one of the most prominent differentiators between hotel rooms.


In this competitive environment, traditional RM solutions, which operate under the assumption that the demand for a product does not depend on what other choices are available, are much less effective in segmenting guests with well-fenced restrictions. Therefore, there is a need for hotels to move towards price optimization solutions based on guests' willingness-to-pay and price elasticity.


Especially for online sales, personalized demand modeling and price optimization have seen relatively little use in the hotel industry, partly due to the difficulty of directly applying these methods to hotel booking. Most of the demand-forecasting tools currently used by the hotel industry are aimed at providing the overall number of bookings based on time series analysis, thus ignoring demand price elasticity and room category features. These demand modeling tools are often ineffective in the presence of heterogeneous guests with significantly different willingness-to-pay.


In contrast to known solutions, embodiments implement a personalized strategy by first dividing the guest base into distinct clusters by applying a machine learning-based soft clustering model based on the guest, travel, and external attributes. Known solutions often accomplish this clustering based only on easily separable guest attributes, such as the trip purpose (e.g., leisure or business), given the assumption of homogeneous guests. This may be too restrictive to apply in practice since guests have their own characteristics which require different choice models. Even for some guests with similar attributes, their choice probabilities may depend on external attributes such as local events, holidays and the weather at the origin and the destination. Therefore, embodiments relax the strong assumption of homogeneity of guests in the choice modeling.


Embodiments include two prior sequential steps: an arrival step and a booking decision step. A customer can arrive (or not) in a hotel room booking system. If the customer arrives, the customer then decides to make a reservation (or not) at the hotel. Once they have arrived at the booking system and decided to reserve a room, they choose a room type. However, in general, observable data is available only for the customers who purchased a product, and if embodiments merely fitted the demand model to the observable data, it could lead to biased estimation and fail to incorporate price sensitivity appropriately. To avoid these possible biases, embodiments incorporate the no-purchase cases, where customers either do not arrive at the booking system because they are not interested in the hotel, or arrive at the booking system but then leave without a purchase due to a high price or the lack of available rooms. Therefore, embodiments can account for the no-purchase cases and competitors (or outside options), which may affect a customer's initial decision, in contrast to previous industry solutions that do not consider those factors.


Embodiments cluster the guests into several groups, or clusters, where the guests with similar attributes are assigned to the same cluster. Moreover, embodiments implement a soft clustering approach by allowing each guest to belong to multiple clusters with certain probabilities. Embodiments then build a multinomial choice model for each cluster, which predicts the probability of selecting a certain room category by each particular guest. Embodiments determine the optimal number of clusters using a data-driven cross-validation approach.


Since the number of attributes is generally very large, the data within each group may be sparse, leading to inaccurate predictions. In order to mitigate this, embodiments implement a “Lasso” regularization method to set the coefficients for the least important model covariates to zero by maximizing the penalized likelihood function of the mixture multinomial choice model.


In order to estimate the parameters (i.e., the arrival rates, the probabilities of belonging to each group, and the parameters for each covariate), embodiments use the Expectation-Maximization (“EM”) algorithm after performing random forest-based soft clustering to find the initial clustering probabilities. Because of the two unobservable factors (i.e., no-purchase process and cluster process), embodiments account for those latent factors. Finally, the parameters extracted from the above are plugged into the personalized pricing algorithm for determining the optimal price of each room type for each guest.


Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be apparent to one of ordinary skill in the art that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments. Wherever possible, like reference numbers will be used for like elements.



FIG. 1 is a block diagram of a computer server/system 10 in accordance with an embodiment of the present invention. Although shown as a single system, the functionality of system 10 can be implemented as a distributed system. Further, the functionality disclosed herein can be implemented on separate servers or devices that may be coupled together over a network. Further, one or more components of system 10 may not be included. For example, when implemented as a web server or cloud based functionality, system 10 is implemented as one or more servers, and user interfaces such as displays, mouse, etc. are not needed.


System 10 includes a bus 12 or other communication mechanism for communicating information, and a processor 22 coupled to bus 12 for processing information. Processor 22 may be any type of general or specific purpose processor. System 10 further includes a memory 14 for storing information and instructions to be executed by processor 22. Memory 14 can be comprised of any combination of random access memory (“RAM”), read only memory (“ROM”), static storage such as a magnetic or optical disk, or any other type of computer readable media. System 10 further includes a communication device 20, such as a network interface card, to provide access to a network. Therefore, a user may interface with system 10 directly, or remotely through a network, or any other method.


Computer readable media may be any available media that can be accessed by processor 22 and includes both volatile and nonvolatile media, removable and non-removable media, and communication media. Communication media may include computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media.


Processor 22 is further coupled via bus 12 to a display 24, such as a Liquid Crystal Display (“LCD”). A keyboard 26 and a cursor control device 28, such as a computer mouse, are further coupled to bus 12 to enable a user to interface with system 10.


In one embodiment, memory 14 stores software modules that provide functionality when executed by processor 22. The modules include an operating system 15 that provides operating system functionality for system 10. The modules further include room demand model module 16 that generates a room demand model to maximize hotel room revenue, and all other functionality disclosed herein. System 10 can be part of a larger system. Therefore, system 10 can include one or more additional functional modules 18 to include the additional functionality, such as the functionality of a Property Management System (“PMS”) (e.g., the “Oracle Hospitality OPERA Property” or the “Oracle Hospitality OPERA Cloud Services”) or an enterprise resource planning (“ERP”) system. A database 17 is coupled to bus 12 to provide centralized storage for modules 16 and 18 and store guest data, hotel data, transactional data, etc. In one embodiment, database 17 is a relational database management system (“RDBMS”) that can use Structured Query Language (“SQL”) to manage the stored data. In one embodiment, a specialized point of sale (“POS”) terminal 99 generates transactional data and historical sales data (e.g., data concerning transactions of hotel guests/customers) used for performing the optimization. POS terminal 99 itself can include additional processing functionality to perform room assignment optimization in accordance with one embodiment and can operate as a specialized room assignment optimization system either by itself or in conjunction with other components of FIG. 1.


In one embodiment, particularly when there are a large number of hotel locations, a large number of guests, and a large amount of historical data, database 17 is implemented as an in-memory database (“IMDB”). An IMDB is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. Main memory databases are faster than disk-optimized databases because disk access is slower than memory access and the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.


In one embodiment, database 17, when implemented as a IMDB, is implemented based on a distributed data grid. A distributed data grid is a system in which a collection of computer servers work together in one or more clusters to manage information and related operations, such as computations, within a distributed or clustered environment. A distributed data grid can be used to manage application objects and data that are shared across the servers. A distributed data grid provides low response time, high throughput, predictable scalability, continuous availability, and information reliability. In particular examples, distributed data grids, such as, e.g., the “Oracle Coherence” data grid from Oracle Corp., store information in-memory to achieve higher performance, and employ redundancy in keeping copies of that information synchronized across multiple servers, thus ensuring resiliency of the system and continued availability of the data in the event of failure of a server.


In one embodiment, system 10 is a computing/data processing system including an application or collection of distributed applications for enterprise organizations, and may also implement logistics, manufacturing, and inventory management functionality. The applications and computing system 10 may be configured to operate with or be implemented as a cloud-based networking system, a software-as-a-service (“SaaS”) architecture, or other type of computing solution.



FIG. 2 is a flow diagram that illustrates the functionality of room demand model module 16 of FIG. 1 in accordance with embodiments. In one embodiment, the functionality of the flow diagram of FIG. 2 is implemented by software stored in memory or other computer readable or tangible medium, and executed by a processor. In other embodiments, the functionality may be performed by hardware (e.g., through the use of an application specific integrated circuit (“ASIC”), a programmable gate array (“PGA”), a field programmable gate array (“FPGA”), etc.), or any combination of hardware and software.


In general, the functionality of FIG. 2 models personalized demand models based on guest's characteristics as well as room features using an approach that includes a combination of clustering and a mixture of multinomial choice modeling. At 202, historical reservation data and guest information is received from an input dataset/database 17. In one embodiment, input dataset 17 is an “OPERA” database from Oracle Corp. and includes details on guests of a single hotel or a group of related hotels such as a chain of hotels as well as available rooms. In other embodiments, a database of data regarding guests and rooms for any type of PMS can be used. In embodiments, input dataset 17 is received via electronic communications from a computing device under the control of the hotel operator and is then parsed by system 10 to extract the information needed for the subsequent functionality disclosed below.


Since the no-arrival and no-booking customers are not recorded in database 17, they are treated as latent or unobserved variables. As disclosed in more detail below, these latent variables are estimated using an Expectation-Maximization (“EM”) algorithm, which iteratively fits the demand model to find the most likely estimate for the rate of all customers including the no-arrival and no-booking customers.


At 204, embodiments cluster guests using machine learning methods (i.e., soft clustering).


To implement a personalized strategy, embodiments first divide the guest base into distinct clusters by applying a machine learning-based soft clustering model based on the guest, travel, and external attributes. Known solutions typically accomplish this clustering based on only easily separable guest attributes, such as the trip purpose (e.g., leisure vs. business) given the assumption of homogeneous guests. This may be too restrictive to apply in practice since guests have their own characteristics which require different choice models. Even for some guests with similar attributes, their choice probabilities may depend on external attributes such as local events, holidays, and weather at origin and destination. Therefore, embodiments relax the strong assumption of homogeneity of guests in the choice modeling.


Further, embodiments add two prior sequential steps: an arrival step and a booking decision step. Customers arrive (or not) at the booking system. If they arrive, they then decide whether or not to make a reservation at the hotel. Once they have arrived at the booking system and decided to reserve a room, they choose a room type.


However, the data is available only for the customers who purchased any product and if the demand model is only fitted to the observable data, it may lead to a biased estimation and not incorporate price sensitivity appropriately. To avoid these possible biases, embodiments incorporate the no-purchase cases where customers may not arrive into the booking system because they are not interested in the hotel or customers arrived in the booking system, but they would leave without purchase due to high price or lack of available rooms. Therefore, embodiments can account for the no-purchase case and competitors (or outside options), which may affect a customer's initial decision, as compared to known solutions that do not consider those factors.


At 206, embodiments perform choice modeling that develops a mixture multinomial logit (“MNL”) model to estimate the demand. A multinomial choice model is built for each cluster of 204, which predicts the probability of each particular guest selecting a certain room category. Embodiments determine the optimal number of clusters using a data-driven cross-validation approach.


At 208, embodiments perform variable selection by eliminating insignificant variables using a Lasso regularization method. Since the number of attributes is usually fairly large, the data within each group may be sparse, leading to inaccurate prediction. In order to mitigate this, the Lasso regularization method sets the coefficients for the least important model covariates to zero by maximizing the penalized likelihood function of the mixture multinomial choice model.
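As a rough illustration of this step only, the sketch below uses scikit-learn's L1-penalized multinomial logistic regression as a lightweight stand-in for maximizing the penalized mixture-MNL likelihood; the synthetic covariates, the number of room types, and the penalty strength are all illustrative assumptions, not values from the embodiments.

```python
# Hypothetical sketch of step 208: an L1 (Lasso) penalty drives the coefficients
# of weak covariates to exactly zero, removing them from the choice model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p, K = 1000, 13, 4                       # bookings, covariates, room types (all illustrative)
X = rng.normal(size=(n, p))                 # synthetic guest/travel/external covariates
true_coef = np.zeros((K, p))
true_coef[:, :4] = rng.normal(size=(K, 4))  # only the first 4 covariates truly matter

utilities = X @ true_coef.T                 # (n, K) linear utilities
probs = np.exp(utilities) / np.exp(utilities).sum(axis=1, keepdims=True)
y = np.array([rng.choice(K, p=row) for row in probs])   # simulated room-type choices

# Smaller C => stronger Lasso penalty => more coefficients suppressed to zero.
mnl_lasso = LogisticRegression(penalty="l1", solver="saga", C=0.05, max_iter=5000)
mnl_lasso.fit(X, y)
kept = np.flatnonzero(np.abs(mnl_lasso.coef_).sum(axis=0) > 1e-8)
print("covariates kept in the model:", kept)
```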


At 210, embodiments estimate model parameters using the Expectation-Maximization (“EM”) algorithm. In order to estimate the parameters (i.e., the arrival rates, the probabilities of belonging to each cluster group, and the parameters for each covariate), embodiments use the EM algorithm after performing random forest-based soft clustering to find the initial clustering probabilities. Embodiments assume a parametric model to predict the demand. Generally speaking, a parametric model is a family of probability distributions that has a finite number of parameters that determine the characteristics of the distribution. The parameters of the model are estimated based on the data to find the values of the parameters that provide the minimal deviation from the observed data. In embodiments, the model has three sets of parameters. First, the probabilities of belonging to each cluster group are estimated by performing random forest-based soft clustering. Next, the arrival rate and booking choice parameters are estimated (i.e., the probability of arriving at the booking system and the booking choice probability if customers arrived). Finally, the parameters for each attribute are estimated, such as guest attributes, travel attributes and external factors. Because embodiments include two unobservable factors (i.e., the no-purchase process and the cluster process), embodiments account for those latent factors.
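For intuition on the E-step/M-step interplay only, the toy sketch below estimates the arrival rate λ when arrivals in no-booking time slots are latent; the booking probability is held fixed and known, which is a simplifying assumption relative to the full embodiment.

```python
# Hypothetical EM sketch: estimate the arrival probability lambda when arrivals
# in no-booking time slots are latent. Booking probability B is assumed known.
import numpy as np

rng = np.random.default_rng(1)
T, true_lam, B = 10_000, 0.30, 0.6         # time slots, true arrival rate, P(book | arrive)
arrived = rng.random(T) < true_lam
booked = arrived & (rng.random(T) < B)      # only 'booked' is observable

lam = 0.5                                   # initial guess
for _ in range(200):
    # E-step: posterior probability of an (unobserved) arrival in a no-booking slot
    alpha = lam * (1 - B) / (lam * (1 - B) + (1 - lam))
    # M-step: closed-form update in the spirit of
    #   lambda^(t+1) = (sum I(b=1) + sum I(b=0) * alpha) / T
    lam = (booked.sum() + (~booked).sum() * alpha) / T

print(f"estimated lambda = {lam:.3f} (true {true_lam})")
```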


At 212, embodiments generate a personalized pricing policy algorithm to maximize hotel revenue. The parameters extracted from the above functionality are plugged into a personalized pricing algorithm to determine the optimal price of each room type for each guest. Further, embodiments can use the model to predict the probability of a particular guest selecting a certain room category.


In addition to the functionality of FIG. 2, embodiments use the determined optimal pricing to store and update databases that provide prices to online services. These updates can be frequent (e.g., multiple times a day or hour) and cause electronic devices to be automatically modified based on the modified prices. Further, embodiments may cause hotels to be more fully utilized, thus resulting in additional services being used in the hotels. Further, embodiments cause the optimized prices to be sent over a network, which causes other computing devices/servers to modify prices in a pricing database according to the revised optimized prices.


Personalized Demand Model


Embodiments consider K types of hotel rooms with K different prices. The outcome variable y, the choice of room purchased, takes a value from 1, . . . , K. The demand for the hotel rooms can vary across the individual attributes of the hotel customers, their booking channels and room category features. x denotes all of the features affecting the choice of a hotel room. The personalized demand model models the outcome y given x.


One challenging issue is that data is only available for observed purchases of the hotel rooms. If the no-purchase cases are ignored and the demand model is based only on the purchased cases, it leads to biases by underestimating price sensitivity. Some customers might decide not to purchase because the price is higher than their willingness to pay. To avoid such biases, embodiments model the customer arrival process by dividing a day into small discrete time slices, denoted by t=1, . . . , T, during which at most one customer might arrive. The arrival process at time t is modeled as a Bernoulli distribution with the arrival probability denoted by λ. Given an arrival, it is assumed that a customer makes a decision between booking and not booking any hotel room based on the prices. A logistic regression model is considered for the booking process given the room prices. For the no-purchase (no-booking) cases, proxy prices can be used, such as the average daily price for each room.
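A small simulation of the arrival and booking steps just described, with assumed values for λ, β0 and β1 (the numbers below are purely illustrative, not estimates from real hotel data):

```python
# Hypothetical sketch of the first two demand-model steps:
#   arrival  r_t ~ Bernoulli(lambda)
#   booking  logit(B_t) = beta0 + beta1 * p_tilde_t, evaluated only when r_t = 1
import numpy as np

rng = np.random.default_rng(2)
T = 5000
lam, beta0, beta1 = 0.25, 4.0, -0.02        # assumed parameters (beta1 < 0: higher price, fewer bookings)
p_tilde = rng.uniform(120, 320, size=T)     # daily summary price (e.g., average of the K room prices)

r = rng.random(T) < lam                                     # arrival indicator
book_prob = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * p_tilde)))
b = r & (rng.random(T) < book_prob)                         # booking indicator

print(f"arrival rate {r.mean():.3f}, booking rate among arrivals {b.sum() / r.sum():.3f}")
print(f"booking probability at $150: {1/(1+np.exp(-(beta0 + beta1*150))):.2f}, "
      f"at $300: {1/(1+np.exp(-(beta0 + beta1*300))):.2f}")
```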


Given booking after arrival, guests choose a room among K different rooms according to their own preference given any conditions. For example, the demand depends on guest attributes such as loyalty status, profile preferences, ancillary services, or external attributes such as local events, holidays and weather. To model such a personalized demand, embodiments first segment the guests into G clusters (204 of FIG. 2) based on the information x so that their demand patterns are homogeneous within each cluster but heterogeneous across clusters and then assume a multinomial logit model within each cluster separately (206 of FIG. 2). Since the cluster membership is unknown, a mixture multinomial logit model is assumed where the probabilities of belonging to each cluster are specified as parameters and then estimated from data. This is referred to as a “choice process.” The following personalized demand model incorporates the three sequential steps (“demand model steps”):









Arrival:  $r_t \sim \mathrm{Bernoulli}(\lambda)$   (i)

Booking:  $\log\dfrac{B_t}{1 - B_t} = \beta_0 + \beta_1 \tilde{p}_t$, for $r_t = 1$   (ii)

Choice:  $\displaystyle \sum_{g=1}^{G} \pi_g(x_t, p_t^k) \prod_{k=1}^{K} \left\{ \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\}^{I(y_t = k)}$, for $r_t = b_t = 1$   (iii)

where $B_t = \Pr(b_t = 1 \mid \tilde{p}_t, r_t = 1)$ and $\tilde{p}_t$ denotes a summary statistic of the K room prices at time t, such as the average, minimum, or maximum. $p_t^k$ is the price of a type-k room at time t, and $\pi_g(x_t, p_t) = \Pr(z_t = g \mid x_t, p_t)$ denotes the probability of belonging to cluster g given $x_t$ and $p_t = (p_t^1, \ldots, p_t^K)'$, where $z_t$ is a cluster indicator for a customer who purchased at time t.
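To make step (iii) concrete, the sketch below evaluates mixture choice probabilities for one booked guest with two clusters and three room types; every coefficient, price and cluster weight is a made-up assumption for illustration, and the within-cluster probabilities use the standard MNL softmax form.

```python
# Hypothetical sketch of the within-cluster choice step: each cluster g has its own
# MNL coefficients, and the guest's choice probabilities are a pi_g-weighted mixture.
import numpy as np

x = np.array([1.0, 0.0, 2.0])                 # toy guest/travel/external covariates
prices = np.array([180.0, 240.0, 380.0])      # Superior, Deluxe, Suite (assumed prices)
pi_g = np.array([0.7, 0.3])                   # soft-cluster membership probabilities

delta = np.array([[[0.4, 0.1, 0.2], [0.2, 0.0, 0.5], [0.1, 0.0, 0.3]],   # cluster 1: delta_k rows
                  [[0.2, 0.3, 0.0], [0.5, 0.1, 0.1], [0.9, 0.2, 0.4]]])  # cluster 2
gamma = np.array([[-0.010, -0.012, -0.015],   # cluster 1: price sensitivity per room type
                  [-0.004, -0.005, -0.006]])  # cluster 2: less price sensitive

def choice_probs(g):
    """Standard MNL softmax over room types within cluster g."""
    util = delta[g] @ x + gamma[g] * prices
    e = np.exp(util - util.max())
    return e / e.sum()

mixture = sum(pi_g[g] * choice_probs(g) for g in range(2))
print("P(Superior, Deluxe, Suite) =", np.round(mixture, 3))
```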



FIG. 3 illustrates an example of clustering based on attributes in accordance with one embodiment. The number of clusters, up to “Cluster G”, can be chosen based on the Bayesian Information Criterion (“BIC”). The number of clusters G is unknown a priori, and therefore G needs to be selected based on the data. BIC is used to determine the number of clusters, and is a consistent and efficient criterion for choosing the number of mixture components under Gaussian distributional assumptions. The attributes can be divided into guest attributes, travel attributes and external factors.
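A minimal illustration of BIC-driven selection of G is sketched below, using a Gaussian mixture over numeric attributes as a stand-in for the embodiment's clustering model; the synthetic three-segment data and the candidate range of G are assumptions made only to show the BIC mechanics.

```python
# Hypothetical sketch: pick the number of clusters G that minimizes BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic "guest attribute" data drawn from 3 latent segments.
X = np.vstack([rng.normal(loc=c, scale=0.6, size=(200, 4)) for c in (0.0, 2.5, 5.0)])

bic = {g: GaussianMixture(n_components=g, random_state=0).fit(X).bic(X) for g in range(1, 7)}
best_G = min(bic, key=bic.get)
print({g: round(v, 1) for g, v in bic.items()})
print("selected G =", best_G)
```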



FIG. 4 illustrates a further example of clustering using random forest machine learning for correlation in accordance with one embodiment. Correlation among the variables/attributes can be reduced by selecting subsets of the variables. In embodiments, guest attributes, travel attributes and external factors are the variables determining the clustering process through the random forest. In one embodiment, the random forest implements repeated decision trees with bootstrap sampling; three or four variables are randomly selected from the 13 variables within each decision tree, and the bootstrap sample size is 500.


Clustering is the process of partitioning data into subgroups so that the data points in each group are more similar to each other, according to some distance measure. Random forest for clustering uses an algorithm that generates a proximity matrix that gives a rough estimate of the distance between samples. Alternative methods for clustering can be used in other embodiments.
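One common way to obtain such a proximity matrix is sketched below: a random forest is trained to separate real guest records from a column-wise permuted copy, leaf co-occurrence yields proximities, and soft memberships are then derived from a low-dimensional embedding of the resulting distances. This particular construction (synthetic contrast class, PCA embedding, Gaussian mixture memberships) is an assumption for illustration rather than the patent's exact procedure.

```python
# Hypothetical sketch: random-forest proximities turned into soft cluster memberships.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 13))                                     # 13 guest/travel/external attributes
X_synth = np.column_stack([rng.permutation(col) for col in X.T])   # permuted copy breaks correlations

# "Unsupervised" random forest: learn to distinguish real rows from permuted rows.
forest = RandomForestClassifier(n_estimators=500, max_features=4, random_state=0)
forest.fit(np.vstack([X, X_synth]), np.r_[np.ones(len(X)), np.zeros(len(X_synth))])

leaves = forest.apply(X)                                           # (n_samples, n_trees) leaf ids
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Soft memberships: fit a small mixture on an embedding of the proximity-based distances.
embedding = PCA(n_components=3).fit_transform(1.0 - proximity)
membership = GaussianMixture(n_components=2, random_state=0).fit(embedding).predict_proba(embedding)
print("soft cluster memberships of the first guest:", np.round(membership[0], 2))
```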



FIG. 5 illustrates an example of applying a mixture MNL model to each cluster to estimate the demand in accordance with one embodiment. Because customers' demand patterns tend to be different across clusters, the choice model shown in FIG. 5 follows an MNL for each cluster separately (e.g., MNL 1 for Cluster 1, MNL 2 for Cluster 2, etc.).


When analyzing data, it is generally assumed that each observation comes from one specific distribution. However, in practice, assuming that each sample comes from the same distribution might be too restrictive. Often the data are complicated. For example, the data might be skewed or multimodal. Therefore, in embodiments, mixture models are used to describe such complicated probabilistic behavior of the data. A mixture model assumes that each observation is generated from one of G mixture components and, within each component, it assumes a specific distribution. In embodiments, the demand for different room types is of interest, which is defined as a categorical variable and modeled as a mixture of multinomial logit (MNL) regression models.



FIG. 6 illustrates the complete personalized demand model in accordance with one embodiment. As shown, and as described by the demand model steps above, the model incorporates arrival, booking and room choice. Embodiments can observe only customers that have booked the hotel rooms. The other customers in the market have either never entered the system (no-arrival) or did not book the rooms (no-booking). These unobserved customers are described by what is generally called latent (or unobserved) variables. Statistical methods allow the estimation of these variables by fitting the distribution of the observed variables. Further, the statistical approach used by embodiments allows for distinguishing between no-arrival and no-booking customers.


Specifically, for each time slot t with no booking customers denoted by indicator variable bt=0, it is not known whether arrival indicator variable rt is 1 or 0. Since for those time slots, rt is a latent variable, embodiments use the EM algorithm to estimate the model parameters. Here, the EM algorithm is an iterative method to find maximum likelihood estimates of parameters in statistical models that would most closely fit the observed variables.


Model Estimation of the Personalized Model


Embodiments perform model estimation of the personalized model (shown in FIG. 6) using two steps. First, an unsupervised clustering method such as random forest (shown in FIG. 5) is used to compute the probability of belonging to cluster g for each guest who arrived and booked at time t, πg(xt, pt) (referred to as a “segmentation” step). Given these probabilities, embodiments find the maximum likelihood (“ML”) estimator of the model parameter θ={λ, β0, β1, δkg, γkg: k=1, . . . , K, g=1, . . . , G}. Since the arrival variable, rt, is unobservable for no-purchase cases (i.e., bt=0) and the cluster membership variable, zt, is latent, the EM algorithm is implemented to find the ML estimator. This is referred to as the “EM step.”


In connection with the EM algorithm, it is helpful to first consider the complete likelihood function when all the variables {rt, bt, zt: t=1, . . . , T} are observed, which is given by:







$$L(\theta) = \prod_{t=1}^{T} \left[ \lambda^{I(r_t=1)} \times \left\{ \frac{\exp(\beta_0 + \beta_1 \tilde{p}_t)}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} \right\}^{I(r_t=1)I(b_t=1)} \times \left\{ \pi_g(x_t, p_t^k)\, \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\}^{I(r_t=1)I(b_t=1)I(y_t=k)I(z_t=g)} \times \left\{ \frac{1}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} \right\}^{I(r_t=1)I(b_t=0)} \times (1-\lambda)^{I(r_t=0)} \right].$$

Then, the conditional expected log-likelihood function given the observed data $D = \{r_t, b_t : t = 1, \ldots, T,\ b_t = 1\}$, denoted by $\bar{\ell}(\theta)$, is

$$\bar{\ell}(\theta) = \sum_{t : b_t = 1} \left\{ I(r_t=1)\log\lambda + I(r_t=1, b_t=1)\log\frac{\exp(\beta_0 + \beta_1 \tilde{p}_t)}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} \right\} + \sum_{t : b_t = 1} \sum_{g=1}^{G} \sum_{k=1}^{K} E\{z_t = g \mid D\}\, I(r_t=1, b_t=1, y_t=k) \left\{ \log\pi_g(x_t, p_t^k) + \log\frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\} + \sum_{t : b_t = 0} \left[ E\{I(r_t=1) \mid D\} \log\frac{1}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} + E\{r_t = 0 \mid D\}\log(1-\lambda) \right].$$

The maximizer is found by implementing the EM algorithm as follows. For the t-th iteration, in the (E-step), given the t-th updated parameters, embodiments compute:








$$E(r_t = 1 \mid D) = \frac{\lambda\{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)\}^{-1}}{1 - \lambda\{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)\}^{-1}} \;\overset{\text{def}}{=}\; \alpha_t,$$

$$\tilde{\pi}_g(x_t, p_t^k, y_t) = E(z_t = g \mid x_t, p_t^k, b_t=1, r_t=1, y_t=k) = \Pr(z_t = g \mid x_t, p_t^k, b_t=1, r_t=1, y_t=k) \propto \Pr(z_t = g \mid x_t, p_t^k, b_t=1, r_t=1) \times f(y_t = k \mid x_t, p_t^k, b_t=1, r_t=1, z_t=g) = \pi_g(x_t, p_t^k) \times \prod_{k=1}^{K} \left\{ \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\}^{I(y_t=k)},$$

where $\sum_{g=1}^{G} \tilde{\pi}_g(x_t, p_t^k, y_t) = 1$ and $E(r_t = 0 \mid D) = 1 - \alpha_t$.


(M-step). Obtain the (t+1)-th updated parameters as follows: compute

$$\lambda^{(t+1)} = \frac{\sum_{t=1}^{T} \left\{ I(b_t = 1) + I(b_t = 0)\,\alpha_t \right\}}{T}$$

and update $(\beta_0^{(t+1)}, \beta_1^{(t+1)})$ by solving the following equation with respect to $(\beta_0, \beta_1)$:










$$\sum_{t=1}^{T} \left[ I(b_t = 1) \left\{ 1 - \frac{\exp(\beta_0 + \beta_1 \tilde{p}_t)}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} \right\} - I(b_t = 0)\,\alpha_t\, \frac{\exp(\beta_0 + \beta_1 \tilde{p}_t)}{1 + \exp(\beta_0 + \beta_1 \tilde{p}_t)} \right] (1, \tilde{p}_t) = (0, 0).$$





To update $(\delta_k^{g(t+1)}, \gamma_k^{g(t+1)})$, solve the following equation with respect to $(\delta_k^g, \gamma_k^g)$:

$$\sum_{t : b_t = 1}^{T} I(r_t = 1) \sum_{g=1}^{G} \sum_{k=1}^{K} \tilde{\pi}_g(x_t, p_t^k, y_t) \left\{ I(y_t = k) - \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \sum_{k=2}^{K} \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\} (x_t, p_t^k) = (0, 0).$$





Then, the (E-step) and (M-step) are repeated until a convergence criterion is met.


This estimation method implicitly assumes that the number of clusters G is known. Since G is unknown in practice, the best G is chosen for the given data. In one embodiment, 10-fold cross-validation is used and G is chosen to minimize the misclassification rate. BIC can also be used. If G=1 is selected, then the proposed personalized demand function based on the mixture MNL model reduces to the classical MNL model commonly used in practice. In other words, the classical MNL model is a special case of the above model.
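A compact sketch of the cross-validation selection of G: for each candidate G, guests in the training folds are clustered, cluster membership is appended to the covariates as a lightweight stand-in for fitting one choice model per cluster, and held-out misclassification of the chosen room type is averaged over 10 folds. The data, features and candidate range below are illustrative assumptions only.

```python
# Hypothetical sketch: choose the number of clusters G by 10-fold cross-validation,
# scoring each candidate by held-out misclassification of the booked room type.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(5)
n, p = 1200, 6
X = rng.normal(size=(n, p))                                           # synthetic guest covariates
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int) + (X[:, 2] > 1).astype(int)   # toy room choice in {0,1,2}

def cv_error(G, folds=10):
    errs = []
    for train, test in KFold(folds, shuffle=True, random_state=0).split(X):
        km = KMeans(n_clusters=G, n_init=10, random_state=0).fit(X[train])
        # One-hot cluster membership appended to the covariates: a simple
        # stand-in for fitting a separate choice model within each cluster.
        Z_tr = np.hstack([X[train], np.eye(G)[km.labels_]])
        Z_te = np.hstack([X[test], np.eye(G)[km.predict(X[test])]])
        model = LogisticRegression(max_iter=2000).fit(Z_tr, y[train])
        errs.append((model.predict(Z_te) != y[test]).mean())
    return float(np.mean(errs))

scores = {G: cv_error(G) for G in (1, 2, 3, 4)}
print(scores, "-> selected G =", min(scores, key=scores.get))
```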



FIG. 7 illustrates an example of a likelihood function in accordance with one embodiment. The likelihood function is used at 208 of FIG. 2 in eliminating insignificant variables. The likelihood function shown in FIG. 7 combines a Poisson arrival process (rt), a binomial booking process (bt) and a mixture multinomial logit purchase choice process (dt) into a single model.


Variable Selection


Further in connection with 208 and the variable selection, FIG. 8 illustrates a variable selection algorithm in accordance with an embodiment. As shown in FIG. 8, a Lasso penalty function uses 10-fold cross-validation to suppress insignificant parameters. The parameters are coefficients corresponding to each variable, either observed or latent, in the regression model, which are estimated to give the best fit for the model given the set of observations.


Embodiments specify κp, a lasso penalty tuning parameter that enables choosing the best model. Note that the (E-step) is the same as the E-step disclosed above, because the penalized log-likelihood function is the conditional expected log-likelihood function with the addition of a function of the parameters |δkjg|+|γkg|, which are not latent variables. The (M-step) for (δkjg, γkg) needs to be modified due to the penalty function. After completing the (E-step), a maximizer of the objective function in FIG. 8 is then determined. The only difference between the expected log-likelihood function disclosed above and the function in FIG. 8 is the last term on the right side, which is the penalty function of |δkjg|+|γkg| for j=1, . . . , p, k=1, . . . , K and g=1, . . . , G. The (t+1)-th updated parameters are the same as in the previous (M-step). Now, the maximizer of the expected log-likelihood function in FIG. 8 is determined with respect to (δkjg, γkg). Equivalently, the maximizer of the following part of the objective function can be determined:












$$\sum_{t : b_t = 1} \left\{ \sum_{g=1}^{G} \sum_{k=1}^{K} \tilde{\pi}_g(x_t, p_t^k, y_t)\, I(r_t = 1, b_t = 1, y_t = k) \log \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)} \right\} - \sum_{g=1}^{G} \kappa_p \sum_{k=1}^{K} \sum_{j=1}^{P} \left\{ |\delta_{kj}^g| + |\gamma_k^g| \right\}$$

$$= \sum_{t : b_t = 1} I(r_t = 1) \left\{ \sum_{g=1}^{G} \sum_{k=1}^{K} \tilde{\pi}_g(x_t, p_t^k, y_t)\, I(y_t = k) \left[ \left( x_t' \delta_k^g + p_t^k \gamma_k^g \right) - \log\left\{ 1 + \exp(x_t' \delta_k^g + p_t^k \gamma_k^g) \right\} \right] \right\} - \sum_{g=1}^{G} \kappa_p \sum_{k=1}^{K} \sum_{j=1}^{P} \left\{ |\delta_{kj}^g| + |\gamma_k^g| \right\}$$

$$\overset{\text{let}}{=} \ell(\delta_{kj}^g, \gamma_k^g) - \sum_{g=1}^{G} \kappa_p \sum_{k=1}^{K} \sum_{j=1}^{P} \left\{ |\delta_{kj}^g| + |\gamma_k^g| \right\}.$$











The Newton algorithm to find the maximizer under the multinomial logistic regression can be tedious, because of the vector nature of the response observations. To avoid these numerical complexities, embodiments use the coordinate descent algorithm disclosed in Friedman, J. et al., “Regularization paths for generalized linear models via coordinate descent”, Journal of Statistical Software, 33(1), 1 (2010), herein incorporated by reference.


Embodiments perform partial Newton steps by forming a partial quadratic approximation to the log-likelihood function ℓ(δkjg, γkg) defined as above, allowing only (δkjg, γkg) to vary for a single class at a time, for each k and g. The partial quadratic approximation can be shown to be given by














$$\ell(\delta_{kj}^g, \gamma_k^g) = -\frac{1}{2B} \sum_{t : b_t = 1} I(r_t = 1)\, w_{tk}\, \tilde{\pi}_g(x_t, p_t^k, y_t) \left( z_{tk} - x_t' \delta_k^g - p_t^k \gamma_k^g \right)^2 + C(\tilde{\delta}_{tk}^g, \tilde{\gamma}_k^g),$$




where B is the number of the booking observations, C(·) is a constant function, and







$$z_{tk} = x_t' \delta_k^g + p_t^k \gamma_k^g + \frac{I(y_t = k) - \tilde{p}_{tk}^g}{\tilde{p}_{tk}^g \left( 1 - \tilde{p}_{tk}^g \right)}, \qquad w_{tk} = \tilde{p}_{tk}^g \left( 1 - \tilde{p}_{tk}^g \right) \pi_g(x_t), \qquad \tilde{p}_{tk}^g = \frac{\exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}{1 + \sum_{k=2}^{K} \exp(x_t' \delta_k^g + p_t^k \gamma_k^g)}.$$








In summary, embodiments update the (t+1)-th estimates of δkjg and γkg for k=1, . . . , K and g=1, . . . , G in the (M-step) as follows: obtain the estimates of (δkjg, γkg) by repeating the nested loops; for the m-th iteration and g=1, . . . , G, repeat the following iteration.

    • (i) For k=2, . . . , K, compute:







$$z_{tk}^{g(m+1)} = x_t' \delta_k^{g(m)} + p_t^k \gamma_k^{g(m)} + \frac{I(y_t = k) - \tilde{p}_{tk}^{g(m)}}{\tilde{p}_{tk}^{g(m)} \left( 1 - \tilde{p}_{tk}^{g(m)} \right)}, \qquad w_{tk}^{g(m+1)} = \tilde{p}_{tk}^{g(m)} \left( 1 - \tilde{p}_{tk}^{g(m)} \right) \pi_g(x_t),$$

$$\tilde{p}_{tk}^{g(m)} = \frac{\exp\!\left( x_t' \delta_k^{g(m)} + p_t^k \gamma_k^{g(m)} \right)}{1 + \sum_{k=2}^{K} \exp\!\left( x_t' \delta_k^{g(m)} + p_t^k \gamma_k^{g(m)} \right)}.$$

    • (ii) For j=1, . . . , p, update:










$$\delta_{kj}^{g(m+1)} = \frac{S\!\left\{ \sum_{t=1}^{T} w_{tk}^{g(m)}\, x_{tj} \left( z_{tk}^{g(m)} - \tilde{z}_{tk}^{g(m+1)} \right),\ \kappa_p \right\}}{\sum_{t=1}^{T} w_{tk}^{g(m)}\, x_{tj}^2}$$







where $\tilde{z}_{tk}^{g(m+1)} = \delta_{k0}^{g(m+1)} + \sum_{l<j} x_{tl}\,\delta_{kl}^{g(m+1)} + \sum_{l>j} x_{tl}\,\delta_{kl}^{g(m)}$ and $S(z, \gamma)$ is the soft-thresholding operator with value:










$$S(z, \gamma) = \begin{cases} z - \gamma & \text{if } z > 0 \text{ and } \gamma < |z|, \\ z + \gamma & \text{if } z < 0 \text{ and } \gamma < |z|, \\ 0 & \text{otherwise}. \end{cases}$$










    • (iii) Set k=k+1 and go to (i).


      The iteration is repeated until a convergence criterion is met.
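A small numeric sketch of the soft-thresholding operator S(z, γ) and the single-coefficient coordinate update above, on made-up weights and working residuals (all values are illustrative assumptions):

```python
# Hypothetical sketch of the coordinate-descent building blocks: the
# soft-thresholding operator S(z, gamma) and one weighted coefficient update.
import numpy as np

def soft_threshold(z, gamma):
    """S(z, gamma): shrink z toward zero by gamma; exactly zero once |z| <= gamma."""
    if z > 0 and gamma < abs(z):
        return z - gamma
    if z < 0 and gamma < abs(z):
        return z + gamma
    return 0.0

rng = np.random.default_rng(6)
w = rng.uniform(0.05, 0.25, size=50)                          # w_tk: observation weights
x_j = rng.normal(size=50)                                     # x_tj: one covariate column
partial_resid = 0.8 * x_j + rng.normal(scale=0.5, size=50)    # z_tk - z~_tk: working residual

kappa = 2.0                                                   # lasso penalty tuning parameter
numer = soft_threshold(np.sum(w * x_j * partial_resid), kappa)
delta_kj = numer / np.sum(w * x_j ** 2)
print(f"updated coefficient delta_kj = {delta_kj:.3f}")
print("with a much larger penalty it is suppressed to zero:",
      soft_threshold(np.sum(w * x_j * partial_resid), 50.0) / np.sum(w * x_j ** 2))
```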





The following table describes the variables and parameters in the model:

















Category     Description                                                            Notation      Note
Variable     Price for room-type k at time t                                        ptk           Key predictor
             Covariates except room price at time t                                 xt            Auxiliary information
             Purchased room type at time t                                          yt            Response variable of interest
             Arrival variable                                                       rt            rt is unobservable for the time t
             Booking variable                                                       Bt
Parameter    Arrival probability                                                    λ
             Regression coefficients for booking process                            β0, β1
             Regression coefficients for a choice process within each cluster g     δkg, γkg









In a regression structure such as the model described above in conjunction with FIG. 6, a zero regression coefficient (i.e., parameter) for a variable implies that the variable is deleted from the model. In this sense, suppressing insignificant parameters means that redundant variables' regression coefficients are zero and only the variables with non-zero regression coefficients remain. By suppressing insignificant parameters, embodiments can avoid what is called the overfitting problem. For example, consider a database of hotel reservation transactions that includes the room booked, the guest's information, and the date and time of reservation. It would be easy to construct a model that fits the training set perfectly by using the date and time of reservation to predict the other attributes. However, this model will not generalize well to new data, because those past times will never occur again. The best predictive and fitted model is the one where the validation error has its global minimum.


Moreover, many variables make the model complicated. Let p be the number of explanatory variables except the price. The model in embodiments has 1 (arrival process)+2 (booking process)+(G−1)*(K−1)*(p+2) parameters. If there are 4 different room types and 3 clusters, then the number of parameters that need to be estimated is 1+2+2*3*(p+2), which increases in p. As the number of parameters increases, the model complexity also increases and the prediction accuracy based on the complex model could get worse. Therefore, embodiments choose a simpler model by removing insignificant variables according to the parsimony principle.
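The parameter count above can be checked with a one-line calculation; the values of G, K and p below are the illustrative ones from the text, not values from a fitted model.

```python
# Hypothetical check of the parameter count: arrival (1) + booking (2)
# + (G - 1) * (K - 1) * (p + 2) choice-model coefficients.
def n_params(G, K, p):
    return 1 + 2 + (G - 1) * (K - 1) * (p + 2)

for p in (5, 10, 20):
    print(f"G=3, K=4, p={p}: {n_params(3, 4, p)} parameters")   # grows linearly in p
```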


In connection with 212, the following pricing policy algorithm can be used to determine personalized pricing:







$$\mathrm{Revenue}(P_j) = \sum_{j \,\in\, \text{all room categories}} p_j\, f_j(x, p)$$





The personalized demand model (e.g., FIG. 6) can be used for developing personalized pricing policies to maximize the hotel revenue. The total revenue would change as a function of the price of each room type. In one embodiment, the price for one room type can be varied at a time and the total revenue can be plotted.
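A sketch of that pricing search is given below: hold the other room prices fixed, sweep the Superior price over a grid, and evaluate the expected revenue Σ_j p_j f_j(x, p) under a toy demand model. The MNL coefficients, price range, and outside-option treatment are assumptions for illustration, not the estimates behind FIG. 11.

```python
# Hypothetical sketch of step 212: sweep one room price and pick the revenue maximizer.
import numpy as np

gamma = np.array([-0.015, -0.012, -0.010])     # assumed price sensitivities (Superior, Deluxe, Suite)
base_util = np.array([1.5, 1.2, 0.8])          # assumed non-price utility per room type
fixed_prices = np.array([0.0, 260.0, 420.0])   # Deluxe/Suite prices held fixed; Superior swept

def expected_revenue(superior_price):
    prices = fixed_prices.copy()
    prices[0] = superior_price
    util = base_util + gamma * prices
    # include a no-purchase "outside option" with utility 0
    probs = np.exp(util) / (1.0 + np.exp(util).sum())
    return float(np.sum(prices * probs))

grid = np.arange(120, 401, 10)
revenues = np.array([expected_revenue(p) for p in grid])
best = grid[revenues.argmax()]
print(f"revenue-maximizing Superior price on this grid: ${best}")
```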


As an example of using the generated model to predict the probability of a particular guest selecting a certain room category, consider an example that uses the following experimental dataset: (1) Downtown hotel in Sydney, Australia; (2) 2 years of booking data from January 2012 to January 2014; (3) Three different room types ($$ Suite>$$ Deluxe>$$ Superior); (4) Two different room features: City View, Water View; (5) Number of total reservations: 2,503; (6) Average booking days in advance: 10.29 days; (7) Average length of stay: 1.84 days.


Using the above dataset, the best model had the number of clusters G=2, which had the lowest BIC. A single MNL was used as a benchmark, which did not consider the no-purchase case or clustering. 70% of the data was used for training, and 30% was used for testing. The following performance measure was used:







Performance measure:

$$\text{misclassification ratio} = \frac{\sum_{\text{test data}} I(\text{true } y \ne \text{predicted } y)}{\#\ \text{observations in the test data}}$$


The following is the preference order of Room Types ($$ Suite>$$ Deluxe>$$ Superior): (1) Deluxe—City View; (2) Deluxe—Water View; (3) Suite—City View; (4) Suite—Water View; (5) Superior—City View; (6) Superior—Water View.



FIGS. 9 and 10 illustrate the results of embodiments of the invention using the experimental dataset to predict the probability of a particular guest selecting a certain room category. In FIG. 10, 64 out of 157 coefficients were set to zero, which implies that the corresponding variables were excluded from the model. Embodiments use the LASSO method for variable selection. Since a simpler model is used, selecting some of the variables rather than all of them, the prediction accuracy is improved by avoiding the overfitting problem.



FIG. 11 illustrates the results of an experimental study in accordance with embodiments of the invention. In the study, the inventory of rooms is shown at table 1102. The price for the Superior room is varied while keeping all other prices constant, and the total revenue is plotted at 1104. As shown, the maximum revenue occurs when the Superior room price is set at $200.


As disclosed, embodiments provide personalized demand modeling for the hotel rooms based on the guest attributes. Embodiments use machine learning to cluster reservations based on guest attributes, travel attributes, and external factors prior to applying the demand choice-based model to estimate the price elasticity and willingness-to-pay of each guest cluster for different room features.


Embodiments assume that there are several clusters of guests and fit a multinomial choice model for each cluster. Because the cluster memberships are unobservable, embodiments use a combination of soft clustering and the EM algorithm as the estimation method. Based on the clustered mixture choice model, embodiments define an expected revenue and solve the optimization problem to determine the optimal price of each room type for each guest, which maximizes the expected revenue.


Several embodiments are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations of the disclosed embodiments are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention.

Claims
  • 1. A method of modeling demand and pricing for hotel rooms, the method comprising: receiving historical data regarding a plurality of previous guests, the historical data comprising a plurality of attributes comprising guest attributes, travel attributes and external factors attributes;generating a plurality of distinct clusters based the plurality of attributes using machine learning soft clustering;segmenting each of the previous guests into one or more of the distinct clusters;building a model for each of the distinct clusters, the model predicting a probability of a guest selecting a certain room category and comprising a plurality of variables corresponding to the attributes;eliminating insignificant variables of the models;estimating model parameters of the models, the model parameters comprising coefficients corresponding to the variables; anddetermining optimal pricing of the hotel rooms using the model parameters and a personalized pricing algorithm.
  • 2. The method of claim 1, wherein the model comprises a mixture multinomial logit model (MNL).
  • 3. The method of claim 1, wherein the machine learning soft clustering comprises random-forest based soft clustering.
  • 4. The method of claim 1, where the estimating comprises an Expectation-Maximization (EM) algorithm.
  • 5. The method of claim 1, wherein the eliminating insignificant variables of the models comprise using a regularization method to set coefficients for the insignificant variables to zero by maximizing a penalized likelihood function of the models.
  • 6. The method of claim 1, the plurality of variables comprising latent variables that comprise no-arrival guests and no-booking guests.
  • 7. The method of claim 6, further comprising distinguishing between no-arrival guests and no-booking guests comprising dividing a day into a plurality of discrete time slots during which at most one guest may arrive.
  • 8. The method of claim 1, wherein the optimal pricing comprises for a plurality of different types of rooms of a hotel, assigning an optimized price for each of the different types, the optimal pricing maximizing revenue.
  • 9. A computer readable medium having instructions stored thereon that, when executed by one or more processors, cause the processors to optimize pricing for hotel rooms, the optimization comprising: receiving historical data regarding a plurality of previous guests, the historical data comprising a plurality of attributes comprising guest attributes, travel attributes and external factors attributes;generating a plurality of distinct clusters based the plurality of attributes using machine learning soft clustering;segmenting each of the previous guests into one or more of the distinct clusters;building a model for each of the distinct clusters, the model predicting a probability of a guest selecting a certain room category and comprising a plurality of variables corresponding to the attributes;eliminating insignificant variables of the models;estimating model parameters of the models, the model parameters comprising coefficients corresponding to the variables; anddetermining optimal pricing of the hotel rooms using the model parameters and a personalized pricing algorithm.
  • 10. The computer readable medium of claim 9, wherein the model comprises a mixture multinomial logit model (MNL).
  • 11. The computer readable medium of claim 9, wherein the machine learning soft clustering comprises random-forest based soft clustering.
  • 12. The computer readable medium of claim 9, where the estimating comprises an Expectation-Maximization (EM) algorithm.
  • 13. The computer readable medium of claim 9, wherein the eliminating insignificant variables of the models comprise using a regularization method to set coefficients for the insignificant variables to zero by maximizing a penalized likelihood function of the models.
  • 14. The computer readable medium of claim 9, the plurality of variables comprising latent variables that comprise no-arrival guests and no-booking guests.
  • 15. The computer readable medium of claim 14, further comprising distinguishing between no-arrival guests and no-booking guests comprising dividing a day into a plurality of discrete time slots during which at most one guest may arrive.
  • 16. The computer readable medium of claim 9, wherein the optimal pricing comprises for a plurality of different types of rooms of a hotel, assigning an optimized price for each of the different types, the optimal pricing maximizing revenue.
  • 17. A hotel room pricing system comprising: one or more processors coupled to stored instructions; anda database storing reservation preferences and room features;the processors configured to receive, from the database, historical data regarding a plurality of previous guests, the historical data comprising a plurality of attributes comprising guest attributes, travel attributes and external factors attributes, and implement an optimized pricing module that is configured to perform price optimization comprising: generate a plurality of distinct clusters based the plurality of attributes using machine learning soft clustering;segment each of the previous guests into one or more of the distinct clusters;build a model for each of the distinct clusters, the model predicting a probability of a guest selecting a certain room category and comprising a plurality of variables corresponding to the attributes;eliminate insignificant variables of the models;estimate model parameters of the models, the model parameters comprising coefficients corresponding to the variables; anddetermine optimal pricing of the hotel rooms using the model parameters and a personalized pricing algorithm.
  • 18. The system of claim 17, wherein the model comprises a mixture multinomial logit model (MNL).
  • 19. The system of claim 17, wherein the machine learning soft clustering comprises random-forest based soft clustering.
  • 20. The system of claim 17, where the estimating comprises an Expectation-Maximization (EM) algorithm.
CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority of U.S. Provisional Patent Application Ser. No. 62/923,779, filed on Oct. 21, 2019, the disclosure of which is hereby incorporated by reference.

Provisional Applications (1)
Number Date Country
62923779 Oct 2019 US