This application relates to transportation services. In particular, the application is directed toward a system to optimize incentive policies for maximizing rewards for using a transportation hailing service.
Recently, transportation hailing systems based on a model of matching drivers with passengers via electronic devices have become widespread. Transportation hailing services depend on attracting passengers and retaining drivers. Thus, transportation hailing companies have set up systems to track passengers. Such companies find it advantageous to predict passenger patterns and therefore formulate targeted marketing to increase service to such passengers. In order to retain passenger loyalty, transportation hailing companies often send coupons to preferred passengers that allow discounts from fares paid by the passenger over the company platform.
Such transportation hailing companies often base marketing strategies around such coupons that are made available to passengers via mechanisms such as sending text messages to personal electronic devices such as smart phones. Such companies have found it desirable to refine marketing strategies of when to distribute coupons, to which passengers, and in what amounts, via an analysis system.
Thus, it is desirable to provide an intelligent marketing system that applies Artificial Intelligence (AI), Machine Learning (ML), Reinforcement Learning (RL), and other analysis tools to analyze and process a dataset related to high-volume passengers of a transportation hailing system. It would be desirable for a system to track the state of users of the system, intelligently allocate marketing budgets, and forecast profit, so as to achieve maximum return on marketing investment by attracting more drivers and passengers into the shared transportation system. A long-term goal is to provide automatic, intelligent, and personalized maintenance of each driver and passenger, so as to maximize the life time value (LTV) of all the drivers and passengers.
On the passenger side, it would be desirable to provide a platform that intelligently guides and motivates passengers to maximize life time value (LTV) on the platform through various operational levers such as coupons or other incentives. It would be desirable to establish an intelligent incentive system that automatically tracks the life cycle and current status of each passenger, and automatically configures various operations according to various operational objectives to maximize the LTV of passengers on the transportation hailing system.
One method is a reinforcement learning method to determine an optimal strategy or policy for sending coupons to passengers for the purpose of maximizing rewards. Such a learning method is currently based on historical data of passenger interaction with the transportation hailing system. The challenge in a reinforcement learning policy update is how to evaluate the current strategy for issuing coupons to passengers. Since the policy has not been executed in the historical data, it is difficult to directly update and evaluate the policy with historical trajectories. The focus of the problem is thus how to construct a virtual trajectory generated by the current policy so as to provide a better passenger incentive strategy.
One example disclosed is a transportation hailing system. The system includes a plurality of client devices. Each of the client devices is in communication with a network and executes an application to request a transportation hailing service. Each of the client devices is associated with one of a plurality of passengers. The system includes a plurality of transportation devices. Each of the transportation devices executes an application displaying information to provide transportation in response to a request for the transportation hailing service. A database stores state data and action data received from the plurality of client devices and the plurality of transportation devices. The state data is associated with the utilization of the transportation hailing service and the action data is associated with a plurality of incentive actions for each passenger. Each incentive action is associated with a different incentive to a passenger to engage the transportation hailing service. An incentive system is coupled to the plurality of transportation devices and client devices via the network. The incentive system includes a Q-value neural network trained to determine rewards associated with incentive actions from a set of virtual trajectories of states, incentive actions, and rewards, based on a history of the action data and associated state data. The incentive system includes a V-value neural network operable to determine a V-value from the use of the transportation service for each of the plurality of passengers. The incentive system includes an incentive policy engine operable to order the plurality of passengers according to the associated V-values and determine an incentive policy including selected incentive actions from the plurality of incentive actions based on the determined rewards for each of the plurality of passengers. An incentive server is operable to communicate a selected incentive to at least some of the client devices according to the determined incentive policy via the network.
Another example is a method of determining the distribution of incentives to use a transportation hailing system. The transportation hailing system includes a plurality of client devices. Each of the client devices is in communication with a network and executes an application to request a transportation hailing service. Each of the client devices is associated with one of a plurality of passengers. The transportation hailing system includes a plurality of transportation devices. Each of the transportation devices executes an application displaying information to provide transportation in response to a request for the transportation hailing service. State data and action data received from the plurality of client devices and the plurality of transportation devices are stored for each passenger in a database. The state data is associated with the utilization of the transportation hailing service and the action data is associated with a plurality of incentive actions. Each incentive action is associated with a different incentive to a passenger to engage the transportation hailing service. A Q-value neural network is trained to determine rewards associated with actions from a set of virtual trajectories of states, incentive actions, and rewards, based on a history of the action data and associated state data from the database. A V-value is determined from the use of the transportation service for each of the plurality of passengers via a V-value neural network. The plurality of passengers is ordered according to the associated V-values via an incentive policy engine. An incentive policy including selected incentive actions from the plurality of incentive actions is determined based on the determined rewards via the incentive policy engine. A selected incentive is communicated to at least some of the client devices via an incentive server according to the determined incentive policy.
Another example is an incentive distribution system including a database storing state data and action data received from a plurality of client devices and a plurality of transportation devices. The state data is associated with the utilization of a transportation hailing service by a plurality of passengers associated with one of the client devices. The action data is associated with a plurality of incentive actions. Each incentive action is associated with a different incentive to a passenger to engage the transportation hailing service. The incentive determination system includes a Q-value determination engine that is trained to determine rewards associated with incentive actions from a set of virtual trajectories of states, incentive actions, and rewards, based on a history of the action data and associated state data from the database. The incentive determination system includes a V-value engine operable to determine a V-value from the use of the transportation service for each of the plurality of passengers. The incentive determination system includes an incentive policy determination engine that is operable to order the plurality of passengers according to the associated V-values; and determine an incentive policy including selected incentive actions from the plurality of incentive actions based on the determined rewards for each of the plurality of passengers. The incentive determination system also includes an incentive server, operable to communicate a selected incentive to at least some of the client devices according to the determined incentive policy.
The above summary is not intended to represent each embodiment or every aspect of the present disclosure. Rather, the foregoing summary merely provides an example of some of the novel aspects and features set forth herein. The above features and advantages, and other features and advantages of the present disclosure, will be readily apparent from the following detailed description of representative embodiments and modes for carrying out the present invention, when taken in connection with the accompanying drawings and the appended claims.
Embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
The present disclosure is susceptible to various modifications and alternative forms. Some representative embodiments have been shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that the invention is not intended to be limited to the particular forms disclosed. Rather, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.
Embodiments of the transportation-hailing platform, such as a car-hailing platform, and related methods are configured to generate a policy to optimize incentives for attracting passengers to increase rewards for the transportation hailing system.
The dispatch system 104 is configured to generate a price for transportation from an origin to a destination, for example in response to receiving a request from a client device 102. For some embodiments, the request is one or more data packets generated at the client device 102. The data packet includes, according to some embodiments, origin information, destination information, and a unique identifier. For some embodiments, the client device 102 generates a request in response to receiving input from a user or passenger, for example from an application running on the client device 102. For some embodiments, origin information is generated by an application based on location information received from the client device 102. The origin information is generated from information including, but not limited to, longitude and latitude coordinates (e.g., those received from a global navigation system), a cell tower, a Wi-Fi access point, or a network device or wireless transmitter having a known location. For some embodiments, the origin information is generated based on information, such as address information, input by a user into the client device 102. Destination information, for some embodiments, is input to a client device 102 by a user. For some embodiments, the dispatch system 104 is configured to request origin, destination, or other information in response to receiving a request for a price from a client device 102. Further, the request for information can occur using one or more requests for information transmitted from the dispatch system 104 to a client device 102. The dispatch system 104 also serves to accept payments from the client devices 102 for the transportation hailing services provided when contacting one of the transportation devices 112 carried by drivers.
The dispatch system 104 is configured to generate a quote based on a pricing strategy. A pricing strategy is based on two components: 1) a base price, which is a fixed price relating to the travel distance, travel time, and other cost factors related to meeting the request for transportation; and 2) a pricing factor, which is a multiplier applied to the base price. In this example, the pricing strategy is configured to take into account future effects. For example, the pricing strategy is configured to encourage requests (for example, by a decreased price) that transport a passenger from an area with less demand than supply of transportation and/or pricing power (referred to herein as a “cold area”) to an area that has greater demand than supply of transportation and/or pricing power (referred to herein as a “hot area”). This helps to transform a request from a passenger having an origin in a cold area into an order, that is, the passenger accepts the price quote for transportation to a destination in a hot area. As another example, the dispatch system 104 is configured to generate a pricing strategy that discourages an order (for example, by a reasonably increased price) for a request for transportation from hot areas to cold areas.
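For illustration only, the following Python sketch shows how a quote might combine a base price with a pricing factor that encourages cold-to-hot trips; the factor values and the hot/cold flags are hypothetical assumptions, not the platform's actual pricing parameters.

```python
# Illustrative sketch (not the platform's actual pricing code): a quote is the
# base price scaled by a pricing factor that encourages cold-to-hot trips.
def quote_price(base_price: float, origin_is_hot: bool, dest_is_hot: bool) -> float:
    """Return a price quote; the factor values here are hypothetical examples."""
    if not origin_is_hot and dest_is_hot:
        factor = 0.9   # discount trips that reposition supply toward hot areas
    elif origin_is_hot and not dest_is_hot:
        factor = 1.1   # modestly increase trips that move supply to cold areas
    else:
        factor = 1.0
    return base_price * factor

print(quote_price(20.0, origin_is_hot=False, dest_is_hot=True))  # 18.0
```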
After a driver is assigned to the passenger and drives the passenger to a hot area, the driver is more likely to be able to fulfill another order immediately. This mitigates the supply-demand imbalance, while potentially benefiting both the ride-hailing platform (with increased profit) and the passengers (with decreased waiting time). The future effect of a bubble pricing strategy is reflected in the repositioning of a driver, from the original position at the current time to the destination of the passenger at a future time.
A digital device, such as the client devices 102 and the transportation devices 112, is any device with a processor and memory. In this example, both the client devices 102 and the transportation devices 112 are mobile devices that include an application to exchange relevant information to facilitate transportation hailing with the dispatch system 104. An embodiment of an example digital device is depicted in
An incentive determination system also allows incentives to engage transportation services to be sent to the client devices 102. The incentive determination system includes a strategy server 120. The strategy server 120 is coupled to a passenger database 122. The passenger database 122 includes passenger background specific data and dynamic passenger usage data. The passenger database 122 receives such data from the strategy server 120. The strategy server 120 derives passenger background data from the client devices 102 and from other sources such as the cloud 124. Thus, other information such as address, gender, income level, etc. may be assembled and data mined by the strategy server 120 to provide a profile for each passenger. The strategy server 120 derives dynamic usage data from the dispatch system 104 to determine summary data of the usage of passengers of the transportation hailing services.
The strategy server 120 is coupled to an incentive server 126 that pushes out different incentives to the client devices 102 via the communication network 110. The incentives in this example are coupons that, when redeemed by a passenger, result in a reduction in the fee for using the transportation hailing service. The specific passengers and coupon amounts are determined according to the incentive strategy determined by the strategy server 120. In this example, the coupons are discount amounts applied to the prices of rides ordered by passengers from the transportation hailing system 100. The coupons may have a percentage limit of the total value charged for a ride. The coupons may have different limitations, such as the geographic area or times in which the discount applies, or the amount of the discount. In this example, the coupons are distributed via text messages sent to the client devices 102 via the communication network 110. However, other means may be used to distribute the coupons to passengers, such as emails or other electronic messaging media sent to the client devices 102 or other digital devices that a passenger may have access to. On receiving the coupon on a client device 102, the passenger may provide an input to the client device 102 to activate the coupon and thus obtain the discount in payments for the transportation hailing service. The coupon may also be activated by the passenger via the transportation hailing application on the client device 102. As explained above, the transportation hailing application allows a request to be made for the transportation hailing service. The dispatch system 104 receives the coupon and applies it when determining the fee on contacting one of the transportation devices 112. As will be explained below, the strategy server 120 provides a policy for optimal distribution of coupons to the client devices 102 based on virtual trajectories determined from historical trajectory data in order to maximize the rewards from the passengers.
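The following is a minimal, hypothetical sketch of applying such a coupon to a ride fare, assuming a fixed discount amount capped at a percentage of the fare; the cap and amounts are illustrative, not values used by the system.

```python
# Minimal sketch of applying a coupon to a fare, assuming a fixed discount
# amount capped at a percentage of the ride price (both values hypothetical).
def apply_coupon(fare: float, discount: float, max_pct_of_fare: float = 0.3) -> float:
    """Return the fare after applying the coupon discount, capped by percentage."""
    capped_discount = min(discount, fare * max_pct_of_fare)
    return max(fare - capped_discount, 0.0)

print(apply_coupon(fare=25.0, discount=10.0))  # discount capped at 7.5 -> 17.5
```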
In this example, the Q-value determination engine 204, transition function determination engine 206, and the V-Value engine 208 are neural networks. As will be explained, the Q-value determination engine 204, V-value engine 208, and transition function determination engine 206 are trained from historical passenger data in the database 122 and related data derived from the historical passenger data.
The incentive policy engine 202 produces an incentive strategy that determines different types of coupons to be sent to passengers to maximize rewards to the system 100 in this example. The incentive policy engine 202 dynamically updates the incentive strategy as more use data for the passengers in relation to previous incentives is collected.
Each passenger associated with one of the client devices 102 may have a composition of state statistical data. The state statistical data is compiled for each cycle for all passengers. In this example, the cycle is based on one day, but other cycle periods may be used. The state statistical data includes statistical characteristics (including statistical and data-mining features), real-time characteristics (time, weather, location, etc.), geographic information features (traffic conditions, demand conditions, supply conditions) and so on. The state statistical data is gathered from the strategy server 120 and stored in the database 122.
In this example, actions in the incentive strategy include all of the different types of text messages for coupons sent by the incentive server 126 to the client devices 102. Data relating to the text messages are sent to the database 122 with the exact date, the sending cost, and the type of coupon sent. The type of the coupon can vary by the content of the coupon, such as having different discount values. For the passenger rewards in this example, there are two different types of rewards, an intermediate reward and a long-period reward. The intermediate reward is the payment of a passenger for the finished order on that day. The long-period reward is the life time value (LTV), which accumulates the long-term value of the passenger.
The demand for rides by passengers may be formulated as a reinforcement learning and Markov Decision Process (MDP) problem. The problem may be formulated as a potential MDP <S, A, T, R> and a non-optimal policy, π. S is an observation set of the transportation hailing platform, indicating the state of a passenger, such as the number of orders, the time of completion, the price paid, and the destination and pick-up locations. A is the action set, which indicates the different incentives offered to passengers for using the transportation hailing platform. For example, the action set A may include different coupon amounts or coupons of varying values depending on time or location. T is the transition function, which determines which state the passenger will move to after an action from A is executed from the current state, s. R is the reward function, indicating the amount the passenger paid to the transportation hailing platform in a certain state, s.
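As a rough sketch of this formulation, the <S, A, T, R> tuple for the passenger-incentive problem could be represented as follows; the field names, action labels, and use of daily feature vectors are illustrative assumptions rather than the system's actual data layout.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

import numpy as np

# Minimal sketch of the <S, A, T, R> formulation for the passenger-incentive
# problem; the field and action names are illustrative assumptions.
@dataclass
class PassengerMDP:
    states: Sequence[np.ndarray]              # S: daily passenger feature vectors
    actions: Sequence[str]                    # A: e.g. ["no_coupon", "coupon_5", "coupon_10"]
    transition: Callable[[np.ndarray, str], np.ndarray]     # T(s, a) -> next state
    reward: Callable[[np.ndarray, str, np.ndarray], float]  # R(s, a, s') -> payment
```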
Here the trajectory set of a sample of passengers is D = {τ_1, τ_2, . . . , τ_m}, where τ_i = (s_0^i, a_1^i, r_1^i, . . . , s_L^i, r_L^i) is the historical trajectory of the i-th passenger. The objective is to learn an optimal policy, π*, that maximizes the expected cumulative reward:
J(π*) = E_{π*}[Σ_{t=1}^{∞} γ^{t−1} r_t].
In this equation, π* is the optimal policy the model attempts to learn, E_{π*} is the expectation taken over trajectories generated by that policy, r_t is the reward in the t-th step on the whole trajectory, and γ is the discount factor, so γ^{t−1} discounts the reward received at step t. The objective is therefore to maximize the expected total discounted reward of a trajectory under the optimal policy function π*.
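For concreteness, a minimal sketch of the discounted return of a single trajectory, matching the summation above, is shown below; the per-step rewards are made-up numbers, not data from the platform.

```python
# Sketch: the discounted return of one trajectory, matching
# J(pi) = E[ sum_t gamma^(t-1) * r_t ]; the rewards here are made-up numbers.
def discounted_return(rewards, gamma: float = 0.95) -> float:
    return float(sum(gamma ** (t - 1) * r for t, r in enumerate(rewards, start=1)))

daily_payments = [0.0, 12.5, 0.0, 30.0]   # hypothetical per-step rewards
print(discounted_return(daily_payments))
```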
The disclosed method for determining an optimal incentive strategy focuses on motivating passengers to use the transportation hailing platform. In order to better motivate passengers to complete transportation orders on the platform, a MDP may be built to model the passengers' incentives problem. Currently available data on passengers from the database 122 include daily status, daily payments, and incentives sent via text messages. The passenger's daily status data, in this example, includes the daily characteristics of the passenger in relation to ride data and data mining characteristics of the passenger, which may be obtained from the client devices 102 or other sources. The daily status data in this example does not contain real-time features. The passenger's daily payments data includes payments received from the passenger for transportation hailing services. The marketing data sent to each passenger is derived from historical data reflecting sent messages with coupons from the incentive server 126 to the client devices 102.
In the reinforcement learning task, for strategy, π, the cumulative reward, J(π) may be stated as:
J(π) = E_π[Σ_{t=1}^{∞} γ^{t−1} r_t],
This may be rewritten as:
J(π) = ∫_τ p_π(τ) R(τ) dτ
where p_π(τ) = p(s_0) Π_{t=1}^{L} π(a_t | s_{t−1}) T(s_t | a_t, s_{t−1}) is the probability of generating the trajectory, R(τ) is the cumulative reward in one trajectory, s is the current state, and a is an action, such as one of the coupon types that may be sent to the passengers.
If the data set D contains all possible trajectories, then even if D is a static data set, it can be used to evaluate any incentive policy. However, there are only a few trajectories in the data set D from historical data. Because the policy, π, has not been executed in the historical data, it is difficult to update and evaluate the policy π directly with the historical trajectory. Thus, a virtual trajectory set of the current policy, π, is constructed in this example. The virtual trajectory set is then used in evaluating and updating the current strategy, π, through the historical trajectory reorganization method as will be described below.
In this scheme, the incentive policy, π, is represented by the Q-value determination made by the Q-value determination engine 204 in
After solving the problem of the amount of the coupon for each individual passenger, the next concern is how to select the target passenger population to receive the coupons under a given budget. Here, the method considers issuing coupons to passengers with higher V-values, where the V-value is the sum of the payments of a passenger from that day to the end of a predetermined time period. The V-value function is a value function that is only related to the state, s, and represents the expectation of the reward that the current state, s, can obtain. The larger the V-value, the higher the Gross Merchandise Volume (GMV) that the system can obtain. The V-value is calculated for each passenger via the V-value determination engine 208 in
After obtaining the V-value from the V-value determination engine 208 in
The overall budget allocation process is shown in
The system ranks the population of passengers based on the respective V-value for each passenger as explained above (310). A reconstruction trajectory is used as another input to train the Q-value determination engine 204 (312). The output of the Q-value determination engine 204 is used to update the incentive policy via the policy engine 202 (314). The policy engine 202 determines the updated policy of issuing coupons to passengers. The updated policy provides an input to the reconstruction trajectory determination (312) and the transition network (306).
After completion of the policy update (314) and the ranking based on V-value (310), the system issues coupons by ranked group and the existing policy until the overall budget for such coupons is exhausted (316). The coupons then may be distributed to passengers via the incentive server 126 in
The core of the budget allocation framework is how to carry out the policy update process (314) in
Considering that there is a large amount of historical data in relation to passengers in the database 122, the historical trajectory reorganization method is used to reconstruct the trajectory generated by the current policy, π, and then use the newly generated trajectory to evaluate and update the current strategy.
Two modules, the current policy engine 202 for the policy, π, and the transition function engine 206 for the transition function, T, are required to obtain the reorganization trajectory in this example. The Q-value determination engine 204 is used to update the incentive policy based on the inputs from the current policy engine 202 and the transition function engine 206.
In this example, the Deep Q-Learning Network (DQN) method is used to train and update the Q-value determination engine 204. The Deep Q-learning network is trained based on the historical transition data or reconstruction trajectories without building a simulator. This is done by pooling data in a replay buffer and sampling mini-batches.
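A minimal sketch of such a DQN-style update, assuming a small fully connected Q-network and a replay buffer of (s, a, r, s′) transitions, is shown below; the layer sizes, hyperparameters, and synthetic transitions are illustrative assumptions, not the configuration of the engine 204.

```python
import random
import torch
import torch.nn as nn

# Minimal DQN-style update from a replay buffer of (s, a, r, s') transitions,
# as a sketch of training a Q-value engine; sizes and hyperparameters are
# illustrative, not the platform's actual configuration.
STATE_DIM, N_ACTIONS, GAMMA = 16, 4, 0.95

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Replay buffer filled with synthetic transitions for illustration.
replay_buffer = [(torch.randn(STATE_DIM), random.randrange(N_ACTIONS),
                  random.random(), torch.randn(STATE_DIM)) for _ in range(1000)]

def dqn_update(batch_size: int = 64) -> float:
    """Sample a mini-batch and regress Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    batch = random.sample(replay_buffer, batch_size)
    s = torch.stack([b[0] for b in batch])
    a = torch.tensor([b[1] for b in batch])
    r = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    s_next = torch.stack([b[3] for b in batch])
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * target_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

for step in range(100):
    dqn_update()
```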
The process of training the Deep Q-learning network of the engine 204 is shown in
The transition function (TF) and the transition probability function (TPF) are trained by breaking up the original data and then using the broken-up data for supervised training. The breakup transforms the original trajectory τ = (s_0, a_1, r_1, . . . , s_n, r_n) into a data set U = {(s_i, a_i, r_i, s′)}_{i=1}^{n−1} at the state level. The transition engine 206 in the system learns the two transition models from the data obtained after breaking up the original data. The first transition model is the transition function, TF, which takes a state-action pair as an input and produces the next state, s′, as the target output. The second transition model is the transition probability function, TPF, which directly takes the disbanded triple (s, a, s′) as an input and outputs whether the triple has appeared in the historical data (1 appeared, 0 did not appear). In this example, s is the current state, a is the action, and s′ is the next state.
Although the data is labeled 0 or 1 when training the transition probability function TPF, when the network training is completed in this example, the transition probability function can output a real value in the interval [0, 1], which indicates the probability of the corresponding triple (s, a, s′) occurring.
A tuple of each initial state and action provides an input 520 into a transition function TF model 522. The transition function TF model 522 takes the state-action pair for each of the broken-up data as an input 520 and produces a next state, s′, as a target output 524. A triple of each initial state, action, and subsequent state provides an input 530 into a transition probability function TPF model 532. The transition probability function TPF model 532 directly takes the disbanded triple of state, action, and next state (s, a, s′) as the input 530, and produces an output 534 indicating whether the next state, s′, has appeared following the state, s, in the historical data (1 appeared, 0 did not appear). As explained above, the output may alternatively be a probability between 0 and 1 of whether the next state, s′, has appeared.
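A compact sketch of the two supervised transition models, assuming fixed-length state vectors, a one-hot action encoding, and a shuffled negative sample for the TPF label, might look as follows; these choices are illustrative assumptions rather than the exact training setup of the transition engine 206.

```python
import torch
import torch.nn as nn

# Sketch of the two supervised transition models learned from the broken-up
# trajectories: TF regresses the next state s' from (s, a), and TPF classifies
# whether a triple (s, a, s') occurred historically. Dimensions, the one-hot
# action encoding, and the negative-sampling scheme are illustrative assumptions.
STATE_DIM, N_ACTIONS = 16, 4

tf_model = nn.Sequential(nn.Linear(STATE_DIM + N_ACTIONS, 64), nn.ReLU(),
                         nn.Linear(64, STATE_DIM))
tpf_model = nn.Sequential(nn.Linear(2 * STATE_DIM + N_ACTIONS, 64), nn.ReLU(),
                          nn.Linear(64, 1), nn.Sigmoid())

def one_hot(a: int) -> torch.Tensor:
    v = torch.zeros(N_ACTIONS)
    v[a] = 1.0
    return v

def tf_loss(s, a, s_next):
    """TF: predict the next state from the state-action pair."""
    pred = tf_model(torch.cat([s, one_hot(a)]))
    return nn.functional.mse_loss(pred, s_next)

def tpf_loss(s, a, s_next, label: float):
    """TPF: classify whether the triple (s, a, s') appeared historically."""
    pred = tpf_model(torch.cat([s, one_hot(a), s_next]))
    return nn.functional.binary_cross_entropy(pred, torch.tensor([label]))

# Usage sketch on one historical triple plus a shuffled negative example.
s, s_next = torch.randn(STATE_DIM), torch.randn(STATE_DIM)
loss = tf_loss(s, 1, s_next) + tpf_loss(s, 1, s_next, 1.0) + tpf_loss(s, 1, torch.randn(STATE_DIM), 0.0)
```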
The process of generating a trajectory starts from the initial state s_0: the policy, π, generates an action a_t = π(s_{t−1}), and according to the transition function, T, the next state is s_t = T(s_{t−1}, a_t). Given the input of the current state and action (s, a) and the output of the next state, s′, determined by the transition function, if s′ appears in the historical states, then a one-step trajectory construction is completed. If the s′ output by the transition function is not in the historical state set, S_his, then the output s′ needs to be mapped to the historical state set, S_his. One way of mapping is to find the K-nearest neighbors S_k = {s′_1, s′_2, . . . , s′_k} of s′ in the historical state set S_his. For every s′_i ∈ S_k, the transition probability function TPF is evaluated, and the historical state with the highest transition probability from S_k is picked as the next state, that is, s* = argmax_{s′_i ∈ S_k} TPF(s, a, s′_i).
The reward of the triple (s, a, s′) in a particular construction trajectory is determined by this process. Rewards can be handled in two situations, depending on whether the triple has appeared historically or not. First, if the triple (s, a, s′) has appeared in the historical state-level dataset, U, that is (s, a, s′) ∈ U, then the reward r(s, a, s′) can be obtained directly from the historical data. If the triple has appeared many times in the historical dataset, the average of all the rewards is taken as the reward value. Second, if the triple (s, a, s′) has not appeared in the historical data, then the mean reward on state s′, which can be written as E_{(s̃, ã, s′)∈U} r(s̃, ã, s′), is set as the reward r(s, a, s′), where s̃ ranges over the states whose next state is s′. In other words, according to some exemplary systems, when the triple (s, a, s′) does not exist in the historical dataset, the mean value of the historical rewards for transitions ending in state s′ is used as the reward.
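The reward assignment described above can be sketched as follows, treating states as hashable keys purely for illustration; the helper names and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Sketch of assigning a reward to a reconstructed triple (s, a, s'): use the
# historical average when the triple was observed, otherwise fall back to the
# mean reward over all historical transitions ending in s'.
rewards_by_triple = defaultdict(list)      # (s, a, s') -> [r, r, ...]
rewards_by_next_state = defaultdict(list)  # s' -> [r, r, ...]

def record(s, a, r, s_next):
    rewards_by_triple[(s, a, s_next)].append(r)
    rewards_by_next_state[s_next].append(r)

def reward_for(s, a, s_next) -> float:
    if (s, a, s_next) in rewards_by_triple:       # triple seen historically
        return mean(rewards_by_triple[(s, a, s_next)])
    return mean(rewards_by_next_state[s_next])    # fallback: mean reward on s'

record("s0", "coupon_5", 12.0, "s1")
record("s2", "coupon_10", 20.0, "s1")
print(reward_for("s0", "coupon_5", "s1"))   # 12.0 (observed triple)
print(reward_for("s3", "coupon_5", "s1"))   # 16.0 (mean over rewards ending in s1)
```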
The complete process of obtaining a virtual trajectory from historical data can be seen in the routine shown in
The routine then determines whether the state cycle, i, is less than the trajectory length, L (604). If the state set is not at the end of the trajectory length, the system sets the action of the period according to the policy, a_i = π(s_i) (606). The routine then gets the candidate next state from the transition model, s̃_{i+1} = T(s_i, a_i) (608). The routine then gets the K nearest neighbors KNN(s̃_{i+1}) = {s_{i+1,1}, s_{i+1,2}, . . . , s_{i+1,k}}, where every s ∈ KNN(s̃_{i+1}) satisfies s ∈ S_his (610). Thus, the routine gets the K nearest and most similar historical states for the candidate state s̃_{i+1}. The next state, s*_{i+1}, is set to the neighbor that maximizes the output of the transition probability function, s*_{i+1} = argmax_{s ∈ KNN(s̃_{i+1})} TPF(s_i, a_i, s).
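A sketch of this reconstruction routine, assuming numeric state vectors, a Euclidean K-nearest-neighbor search over the historical states, and stand-in callables for the trained policy, TF, and TPF models, might look as follows.

```python
import numpy as np

# Sketch of the trajectory-reconstruction loop: roll the policy forward with the
# learned transition function, and snap each predicted next state to the
# K-nearest historical neighbor with the highest TPF score. `policy`, `tf_model`,
# and `tpf_model` are stand-ins for the trained components described above.
def reconstruct_trajectory(s0, policy, tf_model, tpf_model, historical_states,
                           length=7, k=5):
    trajectory, s = [], s0
    hist = np.asarray(historical_states)             # S_his as an (n, d) array
    for _ in range(length):
        a = policy(s)                                # a_i = pi(s_i)
        s_pred = tf_model(s, a)                      # candidate next state from TF
        dists = np.linalg.norm(hist - s_pred, axis=1)
        neighbors = hist[np.argsort(dists)[:k]]      # K nearest historical states
        scores = [tpf_model(s, a, nb) for nb in neighbors]
        s_next = neighbors[int(np.argmax(scores))]   # highest transition probability
        trajectory.append((s, a, s_next))
        s = s_next
    return trajectory
```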
The total process of obtaining a policy update is shown in
Then the Deep Q-Learning (DQN) method described above in
Then in the i-th iteration, according to the existing policy, π_{i−1}, the transition function, and the transition probability function, the trajectory Traj_i may be reconstructed by the routine shown in
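The alternating update can be sketched as a short loop that reuses the reconstruction routine above; the helper names fit_q_network and greedy_policy are hypothetical stand-ins for the DQN fitting and policy-extraction steps, not the platform's exact routines.

```python
# Sketch of the alternating policy update: reconstruct virtual trajectories under
# the current policy, then refit the Q-value network on them and extract the
# greedy policy. Callers supply the fitting and extraction helpers.
def iterate_policy(initial_states, policy, tf_model, tpf_model, historical_states,
                   fit_q_network, greedy_policy, n_iterations=10):
    for _ in range(n_iterations):
        trajectories = [reconstruct_trajectory(s0, policy, tf_model, tpf_model,
                                               historical_states)
                        for s0 in initial_states]
        q_function = fit_q_network(trajectories)   # DQN-style fit on virtual data
        policy = greedy_policy(q_function)         # pi_i(s) = argmax_a Q(s, a)
    return policy
```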
After the policy update is completed, the coupon amount for individual users can be determined through the reinforcement learning policy. The budget allocation process also must choose the passengers for the incentive action. The example method considers selecting passengers with higher V-values to issue coupons, where the V-value is the sum of the GMVs of a passenger from the initial day to the end of the optimization horizon. The V-value function is a value function related only to the state, s, which represents the expectation of the reward that the current state, s, can obtain. The higher the V-value, the higher the GMV paid by the passenger. Here the original data is used for the V-value determination engine 208 to learn the V-values.
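A minimal sketch of learning such V-values as a regression from a passenger's state to the cumulative GMV observed over the horizon in the original data is shown below; the network size, optimizer settings, and synthetic batch are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of a V-value engine as a regression from a passenger's state to the
# cumulative payment (GMV) observed from that day to the end of the horizon.
STATE_DIM = 16
v_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(v_net.parameters(), lr=1e-3)

def v_update(states: torch.Tensor, cumulative_gmv: torch.Tensor) -> float:
    """One regression step toward the observed cumulative GMV targets."""
    pred = v_net(states).squeeze(1)
    loss = nn.functional.mse_loss(pred, cumulative_gmv)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Synthetic batch purely for illustration.
v_update(torch.randn(32, STATE_DIM), torch.rand(32) * 100.0)
```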
After determining the V-values from the V-value engine 208, passengers are sorted by V-value from highest to lowest. The updated policy, π, is then used to issue coupons to these passengers in turn, starting from the highest-valued passengers, until the budget is exhausted.
The framework of the budget allocation process is shown in the routine in
First, the V-value function is trained based on the original data D_o, and then the V-value is predicted for all current users according to the V-value function (802). Thus, for each state, s ∈ S_now, the routine gets the V-value V(s) and collects the full set of V-values, V(S_now).
The second step of the process is to arrange the V-values in descending order to get V_sort, and then arrange the corresponding states according to their V-values to get the states of all passengers, S_sort (804). Thus, the set V(S_now) is sorted in descending order, and the sorted values V_sort are obtained along with the corresponding S_sort, so that for all i, V_sort[i] = V(S_sort[i]). After the sorting, the coupon dictionary D_coupon is set to empty and j = 0 (806). The routine then determines whether the budget, B, is larger than zero (808).
As long as the budget, B, is larger than zero, the routine uses the policy to determine which type of coupon to send and subtracts the cost of the coupon from the budget (810). Thus, each action is determined by a = π(S_sort[j]), and the budget is updated as B = B − cost(a). The key of the dictionary D_coupon is set to the ID of the passenger, and the amount of the coupon for each ID is stored (812). Then, walking down the ordered passengers, coupons are issued according to the trained policy π in turn until the budget is consumed, and the passenger ID and coupon amount are stored in D_coupon. This process is repeated until the budget is exhausted (808). Then the dictionary, D_coupon, is returned to give a complete one-day coupon-specific solution (814).
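The allocation loop can be sketched as follows, with v_value, policy, and coupon_cost as stand-ins for the trained V-value function, the updated policy, and the coupon price list; the passenger records are assumed to carry an ID and a state vector.

```python
# Sketch of the budget allocation routine: sort passengers by predicted V-value,
# then walk down the ranking issuing the policy's coupon until the budget is
# exhausted. The callables and record layout are illustrative assumptions.
def allocate_budget(passengers, v_value, policy, coupon_cost, budget):
    ranked = sorted(passengers, key=lambda p: v_value(p["state"]), reverse=True)
    coupon_plan = {}                        # D_coupon: passenger ID -> coupon action
    for p in ranked:
        if budget <= 0:                     # stop once the budget is consumed
            break
        action = policy(p["state"])         # a = pi(S_sort[j])
        budget -= coupon_cost(action)       # B = B - cost(a)
        coupon_plan[p["id"]] = action
    return coupon_plan
```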
The effectiveness and efficiency of the example MDP model were demonstrated by conducting an online experiment across a large number of cities using the example incentive determination system. The passengers in the experiment were ones who finished their first order within 30 days. In this example, the number of passengers was 466,495. The passengers were evenly partitioned into three groups: a control group, a baseline-method group, and an MDP-model group. The online experiment ran for 21 days. The metrics of the experiment are summarized as follows:
The results of the 21-day online experiment are shown in the table in
The techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be desktop computer systems, server computer systems, portable computer systems, handheld devices, networking devices or any other device or combination of devices that incorporate hard-wired and/or program logic to implement the techniques. Computing device(s) are generally controlled and coordinated by operating system software. Conventional operating systems control and schedule computer processes for execution, perform memory management, provide file system, networking, I/O services, and provide a user interface functionality, such as a graphical user interface (“GUI”), among other things.
The computer system 1000 also includes a main memory 1006, such as a random access memory (RAM), cache and/or other dynamic storage devices, coupled to bus 1002 for storing information and instructions to be executed by processor 1004. Main memory 1006 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1004. Such instructions, when stored in storage media accessible to processor 1004, render computer system 1000 into a special-purpose machine that is customized to perform the operations specified in the instructions. The computer system 1000 further includes a read only memory (ROM) 1008 or other static storage device coupled to bus 1002 for storing static information and instructions for processor 1004. A storage device 1010, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to bus 1002 for storing information and instructions.
The computer system 1000 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1000 to be a special-purpose machine. According to one embodiment, the operations, methods, and processes described herein are performed by computer system 1000 in response to processor(s) 1004 executing one or more sequences of one or more instructions contained in main memory 1006. Such instructions may be read into main memory 1006 from another storage medium, such as storage device 1010. Execution of the sequences of instructions contained in main memory 1006 causes processor(s) 1004 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
The main memory 1006, the ROM 1008, and/or the storage 1010 may include non-transitory storage media. The term “non-transitory media,” and similar terms, as used herein refers to any media that store data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1010. Volatile media includes dynamic memory, such as main memory 1006. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge, and networked versions of the same.
The computer system 1000 also includes a network interface 1018 coupled to bus 1002. Network interface 1018 provides a two-way data communication coupling to one or more network links that are connected to one or more local networks. For example, network interface 1018 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, network interface 1018 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN (or a WAN component to communicate with a WAN). Wireless links may also be implemented. In any such implementation, network interface 1018 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
The computer system 1000 can send messages and receive data, including program code, through the network(s), network link and network interface 1018. In the Internet example, a server might transmit a requested code for an application program through the Internet, the ISP, the local network and the network interface 1018.
The received code may be executed by processor 1004 as it is received, and/or stored in storage device 1010, or other non-volatile storage for later execution.
Each of the processes, methods, and routines described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computer systems or computer processors comprising computer hardware. The processes and methods may be implemented partially or wholly in application-specific circuitry.
The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of this disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The exemplary blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed exemplary embodiments. The exemplary systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed exemplary embodiments.
The various operations of exemplary methods described herein may be performed, at least partially, by an algorithm. The algorithm may be comprised in program codes or instructions stored in a memory (e.g., a non-transitory computer-readable storage medium described above). Such an algorithm may comprise a machine learning algorithm. In some embodiments, a machine learning algorithm may not explicitly program computers to perform a function, but can learn from training data to build a prediction model that performs the function.
The various operations of exemplary methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented engines that operate to perform one or more operations or functions described herein.
Similarly, the methods described herein may be at least partially processor-implemented, with a particular processor or processors being an example of hardware. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented engines. Moreover, the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an Application Program Interface (API)).
The performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines. In some exemplary embodiments, the processors or processor-implemented engines may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other exemplary embodiments, the processors or processor-implemented engines may be distributed across a number of geographic locations.
Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.
Although an overview of the subject matter has been described with reference to specific exemplary embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of embodiments of the present disclosure. Such embodiments of the subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single disclosure or concept if more than one is, in fact, disclosed.
The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.
Any process descriptions, elements, or blocks in the flow diagrams described herein and/or depicted in the attached figures should be understood as potentially representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted, executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art.
As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the exemplary configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/CN2019/091247 | 6/14/2019 | WO |