The following relates generally to computer networking, and more specifically to resource allocation.
A computer network may include a set of computing devices that operate as network nodes using shared resources, such as computing power, storage, bandwidth, energy, etc. Resource allocation is a task in computer networking that determines how many of the shared resources should be provided to each network node.
However, a computer network may not efficiently allocate the shared resources. For example, a node may be provided with a predetermined number of resources regardless of current need, leaving those resources idle when they could be employed elsewhere. Additionally, when the resources are allocated to nodes in exchange for payment, a node that has been charged for the use of idle resources may not be aware that it is incurring costs.
A method for resource allocation is described. One or more aspects of the method include receiving utilization data for computing resources shared by a plurality of users; updating a pricing agent using a reinforcement learning model based on the utilization data; identifying resource pricing information using the pricing agent; and allocating the computing resources to the plurality of users based on the resource pricing information.
A method for resource allocation is described. One or more aspects of the method include receiving utilization data for computing resources shared by a plurality of users; identifying resource pricing information using a pricing agent based on the utilization data; providing a computing resource budget to each of the plurality of users based on the resource pricing information; generating utilization recommendations for each of the plurality of users based on the resource pricing information and the computing resource budget; receiving resource requests from one or more of the plurality of users in response to the utilization recommendations; and allocating the computing resources to the plurality of users based on the resource requests.
An apparatus for resource allocation is described. One or more aspects of the apparatus include a utilization data component configured to generate utilization data for computing resources shared by a plurality of users; a pricing agent configured to identify resource pricing information based on a reinforcement learning model; and a resource allocation component configured to allocate the computing resources to the plurality of users based on the resource pricing information.
The present disclosure provides systems and methods for resource allocation that may receive utilization data for computing resources shared by a plurality of users, update a pricing agent using a reinforcement learning model based on the utilization data, identify resource pricing information using the pricing agent, and allocate the computing resources to the plurality of users based on the resource pricing information.
Resource allocation is a task in computer networking that determines how many of the shared resources should be provided to each network node. Resources may be allocated on a predetermined basis, where users of the computer network are free to use or not use the resources as needed. However, this allocation system is inefficient, as the users are not incentivized to de-allocate unneeded resources, and when the resources are allocated to users in exchange for payment, a user that has been charged for the use of idle resources may not be aware that they are incurring costs.
To more efficiently allocate computing resources, an embodiment of the present disclosure includes a machine learning model that collects utilization data and is trained based on the collected utilization data. The machine learning model identifies resource pricing information and allocates the resources to the users based on the resource pricing information.
Accordingly, at least one embodiment of the present disclosure learns about resource utilization in a computer network, and then efficiently allocates resources to users of the network based on the knowledge of resource utilization and resource pricing, so that the resources are intelligently allocated according to both need and an ability to pay for them.
At least one embodiment of the present disclosure may be used in a resource allocation context. For example, a set of users has access to a pool of shared computing resources (such as software, hardware, software that employs distributed hardware, cloud computing resources, etc.), and an embodiment of the present disclosure updates a neural-network based pricing agent via a training component using a reinforcement learning model based on utilization data. By considering price in a training process, the pricing agent learns over time how to set a price for a given period of time, and by allocating the computing resources to the set of users based on the pricing information, computing resource utilization among the set of users is maximized.
The term “utilization data” refers to data that may include identifications for one or more users, identifications of one or more groups a given user is associated with, the number and kinds of resources that are or were allocated to each of the users over a certain time period, and/or whether an allocated resource was used by a user over a certain time period. The utilization data may be organized as user blocks.
The term “computing resources” refers to resources that are shared among users, such as software, hardware, and/or software that employs distributed hardware. In some examples, the computing resources are graphics processing units (GPUs), and their processing power may be shared and utilized by one or more user devices via a cloud network.
The term “pricing agent” refers to a component that includes one or more neural networks that are updated using a reinforcement learning model based on the utilization data. By considering the utilization data in the training process, the pricing agent learns over time how to set optimal resource pricing information that results in maximum computing resource utilization for a given period of time.
The term “resource pricing information” refers to “prices” calculated by the pricing agent to maximize computing resource utilization among a group of users. The term “price” indicates that users may purchase the computing resources according to a computing resource budget that measures resource pricing information against available credit in the budget. The budget may directly correspond to a non-periodic payment into a user account balance (where, for example, each credit in the user account equates to having a credit available in the computing resource budget), or may correspond to a budget that is determined on a periodic basis (where, for example, a user is given a budget of ten credits per month), or may correspond to another appropriate form of budgeting. The resource pricing information corresponds to these credits, and a user's budget is debited when a computing resource is allocated to the user.
An example application of the inventive concept in the resource allocation context is provided with reference to
Referring to
User device 105 may be a personal computer, laptop computer, mainframe computer, palmtop computer, personal assistant, mobile device, or any other suitable processing apparatus. In some examples, user device 105 includes software that communicates with machine learning apparatus 110, cloud 115, and database 120 to receive and display utilization data, computing resource pricing information, computing resource budgets, utilization requests, and/or computing resource allocation notifications. In some examples, when machine learning apparatus 110 allocates the computing resources to user 100, user device 105 is provided with additional functionality and/or processing power. For example, the computing resource may be a GPU, and when machine learning apparatus 110 allocates the GPU to user 100, user device 105 may use the GPU in processing tasks via a mobile or cloud-based software application.
Machine learning apparatus 110 may include a computer implemented network that includes one or more neural networks. Machine learning apparatus 110 may also include one or more processors, a memory subsystem, a communication interface, an I/O interface, one or more user interface components, and a bus. Additionally, machine learning apparatus 110 may communicate with user device 105 and database 120 via cloud 115.
In some cases, machine learning apparatus 110 is implemented on a server. A server provides one or more functions to users 100 linked by way of one or more of the various networks. In some cases, the server includes a single microprocessor board, which includes a microprocessor responsible for controlling all aspects of the server. In some cases, a server uses a microprocessor and protocols to exchange data with other devices or users on one or more of the networks via hypertext transfer protocol (HTTP) and simple mail transfer protocol (SMTP), although other protocols such as file transfer protocol (FTP) and simple network management protocol (SNMP) may also be used. In some cases, a server is configured to send and receive hypertext markup language (HTML) formatted files (e.g., for displaying web pages). In various embodiments, a server comprises a general purpose computing device, a personal computer, a laptop computer, a mainframe computer, a super computer, or any other suitable processing apparatus.
Further detail regarding the architecture of machine learning apparatus 110 is provided with reference to
A cloud such as cloud 115 is a computer network configured to provide on-demand availability of computer system resources, such as data storage and computing power. In some examples, cloud 115 provides resources without active management by user 100. For example, the computing resources may be included in cloud 115. The term cloud is sometimes used to describe data centers available to many users over the Internet. Some large cloud networks have functions distributed over multiple locations from central servers. A server is designated an edge server if it has a direct or close connection to a user. In some cases, cloud 115 is limited to a single organization. In other examples, cloud 115 is available to many organizations. In one example, cloud 115 includes a multi-layer communications network comprising multiple edge routers and core routers. In another example, cloud 115 is based on a local collection of switches in a single physical location.
A database such as database 120 is an organized collection of data. For example, database 120 stores data in a specified format known as a schema. Database 120 may be structured as a single database, a distributed database, multiple distributed databases, or an emergency backup database. In some cases, a database controller may manage data storage and processing in database 120. In some cases, user 100 interacts with the database controller. In other cases, the database controller may operate automatically without user interaction.
At operation 205, the system receives utilization data. In some cases, the operations of this step refer to, or may be performed by, a machine learning apparatus as described with reference to
At operation 210, the system sets resource “prices”. In some cases, the operations of this step refer to, or may be performed by, a machine learning apparatus as described with reference to
At operation 215, the user provides a utilization request based on the resource “prices”. In some cases, the operations of this step refer to, or may be performed by, a user as described with reference to
At operation 220, the system allocates resources. In some cases, the operations of this step refer to, or may be performed by, a machine learning apparatus as described with reference to
An apparatus for resource allocation is described. One or more aspects of the apparatus include a utilization data component configured to generate utilization data for computing resources shared by a plurality of users; a pricing agent configured to identify resource pricing information based on a reinforcement learning model; and a resource allocation component configured to allocate the computing resources to the plurality of users based on the resource pricing information.
In some aspects, the apparatus further includes a utilization recommender configured to generate utilization recommendations for the plurality of users based on the reinforcement learning model. In some aspects, the utilization data component is configured to generate a time series of resource utilization for the plurality of users based on the utilization data. In some aspects, the resource allocation component is configured to provide a resource budget to each of the plurality of users, and to receive resource requests, wherein the allocation of the computing resources is based on the resource budget and the resource requests.
In some aspects, the pricing agent is configured to generate resource prices for each of a plurality of time periods, wherein the allocation of the computing resources is based on the resource prices. In some aspects, the apparatus further includes a training component configured to update the pricing agent using a reinforcement learning model.
Processor unit 300 includes one or more processors. A processor is an intelligent hardware device (e.g., a general-purpose processing component, a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a microcontroller, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device, a discrete gate or transistor logic component, a discrete hardware component, or any combination thereof). In some cases, processor unit 300 is configured to operate a memory array using a memory controller. In other cases, a memory controller is integrated into processor unit 300. In some cases, processor unit 300 is configured to execute computer-readable instructions stored in memory unit 305 to perform various functions. In some embodiments, processor unit 300 includes special purpose components for modem processing, baseband processing, digital signal processing, or transmission processing.
Memory unit 305 includes one or more memory devices. Examples of a memory device include random access memory (RAM), read-only memory (ROM), solid state memory, and a hard disk drive. In some examples, memory is used to store computer-readable, computer-executable software including instructions that, when executed, cause a processor of processor unit 300 to perform various functions described herein. In some cases, memory unit 305 contains, among other things, a basic input/output system (BIOS) which controls basic hardware or software operation such as the interaction with peripheral components or devices. In some cases, memory unit 305 includes a memory controller that operates memory cells of memory unit 305. For example, the memory controller may include a row decoder, column decoder, or both. In some cases, memory cells within memory unit 305 store information in the form of a logical state.
Machine learning model 315 may include one or more artificial neural networks (ANNs). An ANN is a hardware or a software component that includes a number of connected nodes (i.e., artificial neurons) that loosely correspond to the neurons in a human brain. Each connection, or edge, transmits a signal from one node to another (like the physical synapses in a brain). When a node receives a signal, it processes the signal and then transmits the processed signal to other connected nodes. In some cases, the signals between nodes comprise real numbers, and the output of each node is computed by a function of the sum of its inputs. In some examples, nodes may determine their output using other mathematical algorithms (e.g., selecting the max from the inputs as the output) or any other suitable algorithm for activating the node. Each node and edge is associated with one or more node weights that determine how the signal is processed and transmitted.
In ANNs, a hidden (or intermediate) layer includes hidden nodes and is located between an input layer and an output layer. Hidden layers perform nonlinear transformations of inputs entered into the network. Each hidden layer is trained to produce a defined output that contributes to a joint output of the output layer of the neural network. Hidden representations are machine-readable data representations of an input that are learned from a neural network's hidden layers and are produced by the output layer. As the neural network's understanding of the input improves as it is trained, the hidden representation is progressively differentiated from earlier iterations.
During a training process of an ANN, the node weights are adjusted to improve the accuracy of the result (i.e., by minimizing a loss function which corresponds in some way to the difference between the current result and the target result). The weight of an edge increases or decreases the strength of the signal transmitted between nodes. In some cases, nodes have a threshold below which a signal is not transmitted at all. In some examples, the nodes are aggregated into layers. Different layers perform different transformations on their inputs. The initial layer is known as the input layer and the last layer is known as the output layer. In some cases, signals traverse certain layers multiple times.
The term “loss function” refers to a function that impacts how a machine learning model is trained in a supervised learning model. Specifically, during each training iteration, the output of the model is compared to the known annotation information in the training data. The loss function provides a value for how close the predicted annotation data is to the actual annotation data. After computing the loss function, the parameters of the model are updated accordingly and a new set of predictions is made during the next iteration.
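As a minimal sketch of how a loss value can drive a parameter update, the following example assumes a mean-squared-error loss and a hypothetical one-parameter model y = weight * x. Both choices, and all names used, are illustrative assumptions rather than features of the described embodiments.

```python
def mse_loss(predictions, targets):
    """Mean squared error between predicted and known target values."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n


def train_step(weight, inputs, targets, learning_rate=0.1):
    """One gradient-descent update for the hypothetical model y = weight * x."""
    n = len(targets)
    # Derivative of the MSE loss with respect to the single weight.
    grad = sum(2 * (weight * x - t) * x for x, t in zip(inputs, targets)) / n
    return weight - learning_rate * grad


inputs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]  # illustrative data generated by y = 2x
weight = 0.0
losses = []
for _ in range(50):
    # Compare the current predictions with the known targets, then update.
    losses.append(mse_loss([weight * x for x in inputs], targets))
    weight = train_step(weight, inputs, targets)
```

In this sketch the recorded loss shrinks across iterations as the weight approaches the value that generated the targets.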
In one aspect, machine learning model 315 includes utilization data component 320, pricing agent 325, resource allocation component 330, and utilization recommender 335. Each of utilization data component 320, pricing agent 325, resource allocation component 330, and utilization recommender 335 may include one or more ANNs.
According to some aspects, utilization data component 320 receives utilization data for computing resources shared by a set of users. In some examples, utilization data component 320 generates a time series of resource utilization for the set of users based on the utilization data. In some examples, a reinforcement learning model is based on the time series. In some examples, utilization data component 320 identifies a utilization value for a time period based on the utilization data. In some examples, utilization data component 320 predicts a utilization for a time period based on the reinforcement learning model. In some aspects, the computing resources include GPUs configured for machine learning.
According to some aspects, pricing agent 325 identifies resource pricing information. In some examples, pricing agent 325 identifies the resource pricing information based on the utilization data. In some examples, pricing agent 325 identifies the resource pricing information based on a reinforcement learning model. In some examples, pricing agent 325 selects a resource price for a time period from a set of candidate resource prices, where the pricing information includes the resource price. In some aspects, pricing agent 325 is configured to generate resource prices for each of a set of time periods, where the allocation of the computing resources is based on the resource prices. Pricing agent 325 is an example of, or includes aspects of, the corresponding element described with reference to
According to some aspects, resource allocation component 330 allocates the computing resources to the set of users based on the resource pricing information. In some examples, resource allocation component 330 allocates a resource budget to a user of the set of users. In some examples, resource allocation component 330 receives a resource request from the user. In some examples, resource allocation component 330 allocates a portion of the computing resources to the user based on the request. In some examples, resource allocation component 330 deducts a price value from the resource budget based on the resource pricing information. In some examples, resource allocation component 330 determines that the resource request exceeds a remaining amount of the resource budget. In some examples, resource allocation component 330 refrains from providing the computing resources to the user based on the determination.
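The budget handling described above may be sketched as follows. The class name, function name, and credit-based accounting are hypothetical and are introduced only for illustration; the sketch assumes that the cost of a request is the number of requested units multiplied by a per-unit price.

```python
from dataclasses import dataclass


@dataclass
class ResourceBudget:
    """Hypothetical per-user budget, measured in credits."""
    credits: float


def handle_request(budget, units_requested, price_per_unit):
    """Allocate the requested units if the remaining budget covers their cost.

    Returns the number of units allocated (0 if the request is refused).
    """
    cost = units_requested * price_per_unit
    if cost > budget.credits:
        # The request exceeds the remaining budget: refrain from allocating.
        return 0
    budget.credits -= cost  # deduct the price value from the budget
    return units_requested


budget = ResourceBudget(credits=10.0)
first = handle_request(budget, units_requested=2, price_per_unit=3.0)   # costs 6 credits
second = handle_request(budget, units_requested=2, price_per_unit=3.0)  # 6 credits exceed the 4 remaining
```

Here the first request is granted and debited, while the second is refused because it exceeds the remaining budget.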
According to some aspects, resource allocation component 330 provides a computing resource budget to each of the set of users based on the resource pricing information. In some examples, resource allocation component 330 receives resource requests from one or more of the set of users in response to the utilization recommendations. In some examples, resource allocation component 330 allocates the computing resources to the set of users based on the resource requests. In some examples, resource allocation component 330 deducts a price value from the resource budget of a user based on the allocation of the computing resources and the resource pricing information. In some examples, resource allocation component 330 determines that the resource request exceeds a remaining amount of a resource budget of a user. In some examples, resource allocation component 330 refrains from providing the computing resources to the user based on the determination. Resource allocation component 330 is an example of, or includes aspects of, the corresponding element described with reference to
According to some aspects, utilization recommender 335 generates a utilization recommendation for a user based on the predicted utilization. According to some aspects, utilization recommender 335 generates utilization recommendations for each user of the set of users based on the resource pricing information and the computing resource budget. In some aspects, utilization recommender 335 is configured to generate utilization recommendations for the set of users based on the reinforcement learning model. Utilization recommender 335 is an example of, or includes aspects of, the corresponding element described with reference to
According to some aspects, training component 310 updates pricing agent 325 using a reinforcement learning model based on the utilization data. In some examples, training component 310 computes a reward for the time period based on the utilization value, where the reinforcement learning model is based on the reward.
According to some aspects, training component 310 updates pricing agent 325 based on the time series. In some examples, training component 310 computes a reward for the time period based on the utilization value. In some examples, training component 310 updates the pricing agent 325 using a reinforcement learning model based on the reward.
Pricing agent 405 is an example of, or includes aspects of, the corresponding element described with reference to
Referring to
A method for resource allocation is described. One or more aspects of the method include receiving utilization data for computing resources shared by a plurality of users; updating a pricing agent using a reinforcement learning model based on the utilization data; identifying resource pricing information using the pricing agent; and allocating the computing resources to the plurality of users based on the resource pricing information.
Some examples of the method and apparatus further include generating a time series of resource utilization for the plurality of users based on the utilization data, wherein the reinforcement learning model is based on the time series. Some examples of the method and apparatus further include identifying a utilization value for a time period based on the utilization data. Some examples further include computing a reward for the time period based on the utilization value, wherein the reinforcement learning model is based on the reward.
Some examples of the method and apparatus further include selecting a resource price for a time period from a plurality of candidate resource prices, wherein the pricing information comprises the resource price. Some examples of the method and apparatus further include allocating a resource budget to a user of the plurality of users. Some examples further include receiving a resource request from a user. Some examples further include allocating a portion of the computing resources to the user based on the request. Some examples further include deducting a price value from the resource budget based on the resource pricing information.
Some examples of the method and apparatus further include allocating a resource budget to a user of the plurality of users. Some examples further include receiving a resource request from a user. Some examples further include determining that the resource request exceeds a remaining amount of the resource budget. Some examples further include refraining from providing the computing resources to the user based on the determination.
Some examples of the method and apparatus further include predicting a utilization for a time period based on the reinforcement learning model. Some examples further include generating a utilization recommendation for a user based on the predicted utilization. In some aspects, the computing resources comprise GPUs configured for machine learning.
Referring to
At operation 505, the system receives utilization data for computing resources shared by a set of users. In some cases, the operations of this step refer to, or may be performed by, a utilization data component as described with reference to
For example, the utilization data component receives utilization data from a database such as the database described with reference to
The utilization data may thus be organized as user blocks. A user block may include congruent days, and a utilization value for a particular allocated resource on a particular day is represented as a value ∈[0,100], where 0 represents that the user has not used an allocated resource at all.
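One possible in-memory layout for a user block is sketched below. The field names and the per-day lists are assumptions made for illustration; the only properties taken from the description are that a block belongs to a user, covers a sequence of days, and stores a utilization value in [0, 100] for each day.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserBlock:
    """Hypothetical container for one user's utilization over a block of days."""
    user_id: str
    allocated: List[int] = field(default_factory=list)      # resources allocated per day
    utilization: List[float] = field(default_factory=list)  # percent in [0, 100] per day

    def add_day(self, allocated, utilization_pct):
        """Append one day of data, rejecting utilization outside [0, 100]."""
        if not 0 <= utilization_pct <= 100:
            raise ValueError("utilization must lie in [0, 100]")
        self.allocated.append(allocated)
        self.utilization.append(utilization_pct)

    def utilized_resources(self):
        """Per-day utilization (as a fraction) multiplied by allocated resources."""
        return [a * u / 100.0 for a, u in zip(self.allocated, self.utilization)]


block = UserBlock("user-1")
block.add_day(allocated=4, utilization_pct=50)  # half of four resources used
block.add_day(allocated=4, utilization_pct=0)   # allocated but not used at all
```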
The utilization data component may then determine utilized resources (i.e., utilization multiplied by resources) yit allocated to a user i on day t of a block b by generating a time-series statistical model:
where T(b′) denotes the total number of days in block b′ of user i, and mi is an n-dimensional binary vector that indicates group membership of user i (given a global intercept θ1, parameter identification implies that the dimensionality of mi equals the total number of groups minus one), and ϵibt is the model's error term.
Thus, the parameters θ=(θ1, θ2, θ3, θ4, θ5, θ6, θ7:14) correspond to (excluding the intercept θ1) a lagged response, a mean lagged response in the block excluding the response from t−1, an absolute difference in the response of the two last periods, a total resource allocation in past blocks, an index of a period in the block, and a group membership.
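The covariates listed above may be assembled as in the following sketch. It assumes a linear model in which the prediction is the inner product of θ and a 14-entry covariate vector, an eight-entry group membership vector (matching θ7:14), 0-based day indices, and a fill value of 0.0 when a block has too little history. These are assumptions of the sketch, and the function names are hypothetical.

```python
def build_covariates(block_responses, t, past_allocation_total, membership):
    """Covariates for day t (0-based) of the current block.

    block_responses holds the utilized-resource values y for earlier days of
    the block. Missing history is filled with 0.0 (an assumption of this sketch).
    """
    history = block_responses[:t]
    lagged = history[-1] if history else 0.0
    earlier = history[:-1]  # excludes the response from t-1
    mean_earlier = sum(earlier) / len(earlier) if earlier else 0.0
    abs_diff = abs(history[-1] - history[-2]) if len(history) >= 2 else 0.0
    return [1.0,                          # intercept
            lagged,                       # lagged response
            mean_earlier,                 # mean lagged response in the block
            abs_diff,                     # |difference| of the two last periods
            float(past_allocation_total), # total allocation in past blocks
            float(t),                     # index of the period in the block
            ] + [float(m) for m in membership]  # group membership


def predict(theta, covariates):
    """Linear prediction: inner product of parameters and covariates."""
    return sum(th * x for th, x in zip(theta, covariates))


covariates = build_covariates([2.0, 4.0, 3.0], t=3, past_allocation_total=10,
                              membership=[1, 0, 0, 0, 0, 0, 0, 0])
theta = [0.0] * 14
theta[1] = 1.0  # illustrative parameters: predict the lagged response
prediction = predict(theta, covariates)
```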
At operation 510, the system updates a pricing agent using a reinforcement learning model based on the utilization data. In some cases, the operations of this step refer to, or may be performed by, a training component as described with reference to
Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Specifically, reinforcement learning relates to how software agents make decisions in order to maximize a reward. The decision making model may be referred to as a policy. This type of learning differs from supervised learning in that labelled training data is not needed, and errors need not be explicitly corrected. Instead, reinforcement learning balances exploration of unknown options and exploitation of existing knowledge. In some cases, the reinforcement learning environment is stated in the form of a Markov decision process (MDP) based on a set of environment and agent states, a set of actions of the agent, a probability of a state transition under an action, and a reward for transitioning from one state to another during the action.
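The balance between exploration and exploitation can be illustrated with an epsilon-greedy rule, a standard technique that is used here only as an example and is not stated to be the policy of the described pricing agent. The reward values and all names below are hypothetical.

```python
import random


def epsilon_greedy(estimated_rewards, epsilon, rng):
    """Pick an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimated_rewards))  # explore an arbitrary action
    # Exploit: choose the action with the highest estimated reward.
    return max(range(len(estimated_rewards)), key=estimated_rewards.__getitem__)


def update_estimate(estimates, counts, action, reward):
    """Update the running mean of the reward observed for an action."""
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]


rng = random.Random(0)
true_rewards = [0.2, 0.8, 0.5]  # illustrative rewards, unknown to the agent
estimates = [0.0, 0.0, 0.0]
counts = [0, 0, 0]
for _ in range(200):
    action = epsilon_greedy(estimates, epsilon=0.1, rng=rng)
    update_estimate(estimates, counts, action, true_rewards[action])
```

The agent never sees `true_rewards` directly; it only refines its estimates from the rewards returned for the actions it tries.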
Furthermore, many reinforcement learning algorithms utilize dynamic programming techniques. However, one difference between reinforcement learning and other dynamic programming methods is that reinforcement learning does not require an exact mathematical model of the MDP. Therefore, reinforcement learning models may be used for large MDPs where exact methods are impractical.
In some embodiments, the training component updates the pricing agent based on the time series. In some examples, training component 310 computes a reward for the time period based on the utilization value, and updates pricing agent 325 using a reinforcement learning model based on the reward.
For example, at a given period t∈[T], the training component uses data provided by the pricing agent to train the pricing agent to calculate a reward by setting a price such that resource utilization in the period is maximized, where the utilization of resources RESs at period t among N users is:
where URES represents utilized resources and DRES represents demanded resources. As URESit≤DRESit, it follows that utilizationt ∈[0, 1].
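One reading consistent with these definitions is that the utilization at period t is the total of utilized resources across the N users divided by the total of demanded resources. The sketch below assumes that ratio-of-sums form, and also assumes that a period with no demand is treated as zero utilization; the function name is hypothetical.

```python
def utilization(utilized, demanded):
    """Ratio of total utilized to total demanded resources across users."""
    for u, d in zip(utilized, demanded):
        if u > d:
            raise ValueError("utilized resources cannot exceed demanded resources")
    total_demanded = sum(demanded)
    if total_demanded == 0:
        return 0.0  # assumption: no demand is treated as zero utilization
    return sum(utilized) / total_demanded


# Three users: utilized [2, 0, 3] out of demanded [4, 2, 4].
period_utilization = utilization([2, 0, 3], [4, 2, 4])
```

Because each user's utilized resources cannot exceed that user's demanded resources, the returned value stays within [0, 1].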
At operation 515, the system identifies resource pricing information using the pricing agent. For example, the system may set resource pricing information to determine a reasonable price for computing assets. In some cases, the operations of this step refer to, or may be performed by, a pricing agent as described with reference to
For example, at a period t, the pricing agent may use a pricing model:
where Xt are covariates and yt are corresponding response variables. In an embodiment, the pricing agent uses a Linear Regression pricing model. In this case, the term priceτ2 prevents the pricing agent from predicting a best price as either 0 or infinity.
Then, at each period t∈[T], the pricing agent considers a set of candidate prices CPt:
where ESN represents 50 evenly spaced numbers in the interval that follows ESN in equation (5).
Given a computing resource budget at the beginning of period t and demanded resources at period t−1, the pricing agent considers a covariate vector, predicts a corresponding utilization for each price in the set of candidate prices CPt, and selects resource pricing information from the set of candidate prices CPt that corresponds to a highest predicted utilizationt. The system may calculate a computing resource budget as described with reference to
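The selection step may be sketched as follows, assuming an ordinary least squares pricing model over the covariates (intercept, budget, lagged demand, price, squared price) and a hypothetical candidate interval [low, high] sampled at 50 evenly spaced points. The covariate layout, the interval bounds, and the synthetic history are assumptions made for illustration only.

```python
import numpy as np


def fit_pricing_model(X, y):
    """Least squares fit; rows of X are [1, budget, demand, price, price**2]."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def select_price(coef, budget, lagged_demand, low, high, n_candidates=50):
    """Return the candidate price with the highest predicted utilization."""
    candidates = np.linspace(low, high, n_candidates)
    X = np.column_stack([
        np.ones(n_candidates),
        np.full(n_candidates, budget),
        np.full(n_candidates, lagged_demand),
        candidates,
        candidates ** 2,  # squared-price term allows an interior optimum
    ])
    predicted = X @ coef
    return float(candidates[np.argmax(predicted)]), predicted


# Synthetic history (illustrative only): utilization peaks at a price of 2.0.
rng = np.random.default_rng(0)
prices = rng.uniform(0.0, 3.0, size=40)
budgets = rng.uniform(5.0, 15.0, size=40)
demands = rng.uniform(1.0, 10.0, size=40)
observed = 0.1 + 0.4 * prices - 0.1 * prices ** 2
history = np.column_stack([np.ones(40), budgets, demands, prices, prices ** 2])

coef = fit_pricing_model(history, observed)
best_price, predicted = select_price(coef, budget=10.0, lagged_demand=5.0,
                                     low=0.0, high=3.0)
```

With a negative coefficient on the squared-price term, the predicted utilization has an interior maximum, so the selected price lies inside the candidate interval rather than at either end.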
At operation 520, the system allocates the computing resources to the set of users based on the resource pricing information. In some cases, the operations of this step refer to, or may be performed by, a resource allocation component as described with reference to
A method for utilization recommendation is described. One or more aspects of the method include receiving utilization data for computing resources shared by a plurality of users; identifying resource pricing information using a pricing agent based on the utilization data; providing a computing resource budget to each of the plurality of users based on the resource pricing information; generating utilization recommendations for each of the plurality of users based on the resource pricing information and the computing resource budget; receiving resource requests from one or more of the plurality of users in response to the utilization recommendations; and allocating the computing resources to the plurality of users based on the resource requests.
Some examples of the method and apparatus further include generating a time series of resource utilization for the plurality of users based on the utilization data. Some examples further include updating the pricing agent based on the time series. Some examples of the method and apparatus further include identifying a utilization value for a time period based on the utilization data. Some examples further include computing a reward for the time period based on the utilization value. Some examples further include updating the pricing agent using a reinforcement learning model based on the reward.
Some examples of the method and apparatus further include selecting a resource price for a time period from a plurality of candidate resource prices, wherein the pricing information comprises the resource price. Some examples of the method and apparatus further include deducting a price value from the resource budget of a user based on the allocation of the computing resources and the resource pricing information.
Some examples of the method and apparatus further include determining that the resource request exceeds a remaining amount of a resource budget of a user. Some examples further include refraining from providing the computing resources to the user based on the determination.
At operation 605, the system receives utilization data for computing resources shared by a set of users. In some cases, the operations of this step refer to, or may be performed by, a utilization data component as described with reference to
At operation 610, the system identifies resource pricing information using a pricing agent based on the utilization data. In some cases, the operations of this step refer to, or may be performed by, a pricing agent as described with reference to
At operation 615, the system provides a computing resource budget to each of the set of users based on the resource pricing information. In some cases, the operations of this step refer to, or may be performed by, a resource allocation component as described with reference to
For example, each user i in the set of users may have a computing resource budget B for use in resource allocation. The budget may directly correspond to a non-periodic payment into a user account balance (where, for example, each credit in the user account equates to having a credit available in the computing resource budget), or may correspond to a budget that is determined on a periodic basis (where, for example, a user is given a budget of ten credits per month), or may correspond to another appropriate form of budgeting. The resource pricing information is expressed in these credits, and a user's budget is debited (e.g., the computing resource budget of the user is decreased by pricet×DRESit) when a computing resource is allocated to the user. The resource allocation component may track the computing resource budget of each user and provide the computing resource budget to the user via a user device.
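The budget debit, together with the refusal of requests that exceed the remaining budget, can be illustrated with a minimal sketch. The class name, field names, and function name are hypothetical, and the requested number of units stands in for DRESit.

```python
from dataclasses import dataclass


@dataclass
class UserBudget:
    """Hypothetical record of one user's remaining computing resource budget."""
    user_id: str
    credits: float


def allocate(budget: UserBudget, requested_units: int, price: float) -> bool:
    """Debit price * requested_units if affordable; otherwise allocate nothing."""
    if requested_units < 0 or price < 0:
        raise ValueError("requested_units and price must be non-negative")
    cost = price * requested_units
    if cost > budget.credits:
        # The request exceeds the remaining budget, so the system refrains
        # from providing the resources and leaves the budget unchanged.
        return False
    budget.credits -= cost
    return True


user = UserBudget(user_id="user-1", credits=10.0)
print(allocate(user, requested_units=3, price=2.0), user.credits)
print(allocate(user, requested_units=3, price=2.0), user.credits)
```

The first request costs 6 credits and succeeds, leaving 4; the second identical request is refused because 6 credits exceed the remaining balance.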
At operation 620, the system generates utilization recommendations for each of the set of users based on the resource pricing information and the computing resource budget. In some cases, the operations of this step refer to, or may be performed by, a utilization recommender as described with reference to
For example, the utilization recommender may generate utilization recommendations UR for a user i at period t as a vector of dimensionality (1+max resources a user can ask for):
where j∈{0, maximum computing resources available to a user} and pred_yit is the predicted computing resource utilization by a user i at a time t.
In some embodiments, pred_yit is calculated to be equal to yit. In some embodiments, pred_yit is calculated to be yit plus a permanent heterogeneity variable ηi distributed as ηi ∼ (0, 1).
The utilization recommender may provide each utilization recommendation to each user in the set of users via a user device.
At operation 625, the system receives resource requests from one or more of the set of users in response to the utilization recommendations. In some cases, the operations of this step refer to, or may be performed by, a resource allocation component as described with reference to
For example, a user may request to be allocated computing resources through a user device. The user request may be based on whether the user can “afford” the computing resources given their budget. The resource allocation component may calculate the affordability of the computing resources and provide that information to the user i:
AFFORDit={j∈{0, max resources}:j·pricet≤budgetit} (7)
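Equation (7) can be evaluated directly, as in the following minimal sketch. It assumes that j ranges over the whole numbers from 0 up to the maximum number of resources; the function and parameter names are hypothetical.

```python
from typing import List


def affordable_quantities(price: float, budget: float, max_resources: int) -> List[int]:
    """Return every quantity j in 0..max_resources with j * price <= budget."""
    if max_resources < 0:
        raise ValueError("max_resources must be non-negative")
    return [j for j in range(max_resources + 1) if j * price <= budget]


# With a price of 2 credits and a budget of 5 credits, a user can afford 0, 1, or 2 units.
print(affordable_quantities(price=2.0, budget=5.0, max_resources=4))
```

Requesting zero resources is always affordable when the budget is non-negative, so the set is non-empty for any such budget.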
In some cases, the resource allocation component may calculate a probability PROB that a given user will request j computing resources:
where
and the ½ is a coefficient that is instead an estimated parameter in some embodiments. The resource allocation component may use the probability that a user will request computing resources to anticipate the user resource request.
At operation 630, the system allocates the computing resources to the set of users based on the resource requests. In some cases, the operations of this step refer to, or may be performed by, a resource allocation component as described with reference to
where ϵit˜Uniform(−10, 10) and
For example, the resource allocation component may allocate the computing resources to the set of users as described with reference to
At operation 705, the system computes a reward for the time period based on the utilization value. In some cases, the operations of this step refer to, or may be performed by, a training component as described with reference to
At operation 710, the system updates the pricing agent using a reinforcement learning model based on the reward. In some cases, the operations of this step refer to, or may be performed by, a training component as described with reference to
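The reward-driven update can be illustrated with a deliberately simple stand-in. The disclosure does not specify the reinforcement learning model in this passage, so the sketch below swaps in a sample-average action-value update over a fixed set of candidate prices; it is not the Linear Regression pricing model mentioned earlier, and the class and method names are hypothetical. The reward is taken to be the utilization value for the period.

```python
from typing import Dict, Iterable


class PriceValueTable:
    """Running average of observed reward (utilization) per candidate price."""

    def __init__(self, candidate_prices: Iterable[float]) -> None:
        self.values: Dict[float, float] = {p: 0.0 for p in candidate_prices}
        self.counts: Dict[float, int] = {p: 0 for p in self.values}

    def update(self, price: float, reward: float) -> None:
        """Fold one period's reward into the estimate for the price that was set."""
        if not 0.0 <= reward <= 1.0:
            raise ValueError("reward is a utilization value and must lie in [0, 1]")
        self.counts[price] += 1
        n = self.counts[price]
        # Incremental mean: new = old + (reward - old) / n.
        self.values[price] += (reward - self.values[price]) / n

    def best_price(self) -> float:
        """Return the price with the highest average observed reward so far."""
        return max(self.values, key=lambda p: self.values[p])


table = PriceValueTable([1.0, 2.0])
table.update(1.0, 0.4)
table.update(2.0, 0.9)
table.update(2.0, 0.7)
print(table.best_price())
```

In this usage the price of 2.0 accumulates an average reward of about 0.8 over two periods, compared with 0.4 for the price of 1.0, so it is the price the table currently favors.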
The description and drawings described herein represent example configurations and do not represent all the implementations within the scope of the claims. For example, the operations and steps may be rearranged, combined or otherwise modified. Also, structures and devices may be represented in the form of block diagrams to represent the relationship between components and avoid obscuring the described concepts. Similar components or features may have the same name but may have different reference numbers corresponding to different figures.
Some modifications to the disclosure may be readily apparent to those skilled in the art, and the principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein, but is to be accorded the broadest scope consistent with the principles and novel features disclosed herein.
The described methods may be implemented or performed by devices that include a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor, a conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration). Thus, the functions described herein may be implemented in hardware or software and may be executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored in the form of instructions or code on a computer-readable medium.
Computer-readable media includes both non-transitory computer storage media and communication media including any medium that facilitates transfer of code or data. A non-transitory storage medium may be any available medium that can be accessed by a computer. For example, non-transitory computer-readable media can comprise random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), compact disk (CD) or other optical disk storage, magnetic disk storage, or any other non-transitory medium for carrying or storing data or code.
Also, connecting components may be properly termed computer-readable media. For example, if code or data is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technology such as infrared, radio, or microwave signals, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technology are included in the definition of medium. Combinations of media are also included within the scope of computer-readable media.
In this disclosure and the following claims, the word “or” indicates an inclusive list such that, for example, the list of X, Y, or Z means X or Y or Z or XY or XZ or YZ or XYZ. Also, the phrase “based on” is not used to represent a closed set of conditions. For example, a step that is described as “based on condition A” may be based on both condition A and condition B. In other words, the phrase “based on” shall be construed to mean “based at least in part on.” Also, the words “a” or “an” indicate “at least one.”