The present disclosure relates to packet flow and, in particular, to a key performance indicator based scheduler that enables dynamic matching of work items to cores in a given user plane, such as a container or virtual machine.
In the 5G Next Generation Mobile Core, the user plane (UP) needs to have low latency and very high throughput. The user plane also needs to efficiently use CPU (central processing unit) resources based on the workload, be it DPI (deep packet inspection), TCP (transmission control protocol) optimizations, and so forth. Current packet forwarders in a Network Function Virtualization architecture, like OVS-DPDK (Open vSwitch Data Plane Development Kit) or VPP (Vector Packet Processing), have a static binding of cores to work items. User space packet scheduling within a container or virtual machine is contained within a process boundary and therefore cannot dynamically allocate more CPU resources to processes within the container or virtual machine.
Forwarders in the NFVI (network function virtualization infrastructure) like VPP and OVS-DPDK do not have a scalable packet scheduler and throughputs are limited by static assignment of cores to certain ports. Even L2-L3 or L4-L7 processing work items are statically bound to certain cores. The result of this static binding is an uneven load distribution across CPU resources. Accordingly, statically allocating CPU resources can result in a waste of CPU resources.
In order to describe the manner in which the above-recited and other advantages and features of the disclosure can be obtained, a more particular description of the principles briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only exemplary embodiments of the disclosure and are not therefore to be considered to be limiting of its scope, the principles herein are described and explained with additional specificity and detail through the use of the accompanying drawings in which:
Various embodiments of the disclosure are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without parting from the spirit and scope of the disclosure.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or can be learned by practice of the herein disclosed principles. The features and advantages of the disclosure can be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the disclosure will become more fully apparent from the following description and appended claims, or can be learned by the practice of the principles set forth herein.
Disclosed is a method that includes periodically observing packets in a user plane according to at least one key performance indicator in a configuration file to yield an observation, wherein the observation represents a closed-loop demand of resources within the user plane. The method includes adjusting, via a scheduler in the user plane and based on the observation, a binding of cores to work items. The binding between cores and work items is dynamic and changeable to improve performance. The at least one key performance indicator can include one or more of CPU utilization, latency and packet drops. The workload allocations can include work items that are individual schedulable functions that operate on a queue of packets within the user plane.
In another example, a method is disclosed for providing a dynamic binding of cores to work items within a user plane. The method includes assigning a first number of cores to a first work item within the user plane, assigning a second number of cores to a second work item within the user plane and periodically observing packets in the user plane according to at least one key performance indicator in a configuration file to yield an observation, wherein the observation represents a closed-loop demand of resources within the user plane. The method also includes adjusting, via a scheduler in the user plane and based on the observation, a binding of cores to work items by assigning a third number of cores to the first work item within the user plane and assigning a fourth number of cores to the second work item within the user plane.
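The reassignment from a first and second number of cores to a third and fourth number can be illustrated with a minimal sketch. The function name `rebalance`, the work item names, and the 0.8 and 0.3 utilization thresholds are hypothetical and are not taken from the disclosure. The sketch also assumes that the observation has been reduced to a single CPU utilization value per work item.

```python
def rebalance(bindings, utilization, high=0.8, low=0.3):
    """Return a new core-count binding after one observation.

    bindings:    work item name -> number of cores currently bound
    utilization: work item name -> observed CPU utilization (0.0 to 1.0)
    The thresholds are illustrative assumptions, not values from the disclosure.
    """
    new_bindings = dict(bindings)
    busiest = max(new_bindings, key=lambda item: utilization[item])
    idlest = min(new_bindings, key=lambda item: utilization[item])
    overloaded = utilization[busiest] > high
    underloaded = utilization[idlest] < low
    # Move a single core only when demand is clearly uneven and the donor
    # work item keeps at least one core.
    if busiest != idlest and overloaded and underloaded and new_bindings[idlest] > 1:
        new_bindings[idlest] -= 1
        new_bindings[busiest] += 1
    return new_bindings


# First and second numbers of cores (hypothetical work items).
first_and_second = {"dpi": 2, "tcp_opt": 6}
# Third and fourth numbers of cores after one observation.
third_and_fourth = rebalance(first_and_second, {"dpi": 0.95, "tcp_opt": 0.10})
```

In this sketch the total number of cores is unchanged; only the split between the two work items moves.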
The present disclosure addresses the issues raised above. The disclosure provides system, method and computer-readable storage device embodiments. First a general example system shall be disclosed in
To enable user interaction with the computing device 100, an input device 145 can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, and so forth. An output device 135 can also be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems can enable a user to provide multiple types of input to communicate with the computing device 100. The communications interface 140 can generally govern and manage the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed.
Storage device 130 is a non-volatile memory and can be a hard disk or other types of computer readable media which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, solid state memory devices, digital versatile disks, cartridges, random access memories (RAMs) 125, read only memory (ROM) 120, and hybrids thereof.
The storage device 130 can include software services 132, 134, 136 for controlling the processor 110. Other hardware or software modules/services are contemplated. The storage device 130 can be connected to the system connector 105. In one aspect, a hardware module that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as the processor 110, connector 105, display 135, and so forth, to carry out the function.
A set of 8 cores is shown as feature 202 in
The scheduler 204 periodically monitors the key performance indicators at fixed or dynamic intervals, such as every 1 second, and decides at a certain time whether to scale up 214 or to scale down 216 work items according to the data in the configuration file 212. The interval of observation is configurable, as is the point at which, based on a set of observations, a decision is made to either scale up or scale down work items. The intervals can also be dynamic or static. For example, the system could set an interval of observations every second and make a decision every 5 observations or every 5 seconds. The observations may occur at a shorter interval given data associated with the workload such as high CPU usage during a period of time or a scheduled increase in CPU usage. Feedback based on data within the user plane and/or external to the user plane could also be utilized to dynamically set intervals of observation as well as decision intervals on whether to scale up or scale down work items. In one aspect, adjusting workload allocations and/or resources within a physical or a virtualized forwarding node can be based on a closed loop demand of resources. In other words, the closed loop demand for resources involves the demand for resources within the particular user plane environment, whether physical or virtualized.
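The separation between the observation interval and the decision interval can be sketched as follows. The class name `DecisionWindow`, the use of a mean over the window, and the 0.8 and 0.3 thresholds are assumptions made for illustration only; the disclosure does not specify how a set of observations is combined into a decision.

```python
from collections import deque


class DecisionWindow:
    """Collects observations and emits one decision per full window."""

    def __init__(self, observations_per_decision=5, high=0.8, low=0.3):
        # Hypothetical defaults: decide every 5 observations.
        self.samples = deque(maxlen=observations_per_decision)
        self.high = high
        self.low = low

    def observe(self, cpu_utilization):
        """Record one observation; return a decision or None if not yet due."""
        self.samples.append(cpu_utilization)
        if len(self.samples) < self.samples.maxlen:
            return None
        mean = sum(self.samples) / len(self.samples)
        self.samples.clear()  # start a fresh window after each decision
        if mean > self.high:
            return "scale_up"
        if mean < self.low:
            return "scale_down"
        return "no_change"


window = DecisionWindow()
# Five observations, for example one per second, yield a single decision.
decisions = [window.observe(0.9) for _ in range(5)]
```

With one observation per second, this corresponds to the example of making a decision every 5 observations or every 5 seconds.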
In another aspect of this disclosure, the decision-making process, which can result in no change to the bindings between cores and work items, a scaling up of work items or a scaling down of work items, can further factor in application type to determine the resources to which work items could scale. For example, assume in
The scheduler 204, in one aspect, can be considered self-contained within the user plane, container or virtual machine. This can mean that the scheduler 204 does not require communication with an external system such as a north-bound orchestration system for a reallocation of work items with cores. In another aspect, the scheduler 204 could incorporate data from a north-bound orchestration system in its reallocation decisions. Furthermore, in another aspect, the number of cores available to the user plane, container or virtual machine could expand to include additional resources based on the analysis of the key performance indicators and the configuration threshold.
The periodicity of monitoring the indicators can also be static, variable, or triggered based on some event. For example, if a large amount of workload is to be scheduled for using the compute environment, the monitoring of the indicators can be increased in frequency, and thus increased in accuracy, for making scaling up or scaling down decisions. A spike in an observation of one of the data points could also trigger a scaling of work items. Further, the system could schedule a monitoring frequency or level of observation in advance of an expected throttling of any given resource or resource group.
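A variable monitoring period of this kind can be sketched as a function that shortens the interval when an observation spikes and lengthens it when the workload is calm. The function name `next_interval`, the halving and doubling rule, and all numeric bounds are hypothetical choices for illustration and are not prescribed by the disclosure.

```python
def next_interval(current_s, cpu_utilization, spike=0.9, calm=0.3,
                  min_s=0.1, max_s=10.0):
    """Return the next monitoring interval in seconds.

    A high observation halves the interval (more frequent monitoring);
    a low observation doubles it. The result is clamped to [min_s, max_s].
    All thresholds and bounds are illustrative assumptions.
    """
    if cpu_utilization >= spike:
        candidate = current_s / 2
    elif cpu_utilization <= calm:
        candidate = current_s * 2
    else:
        candidate = current_s
    return max(min_s, min(max_s, candidate))
```

For example, starting from a 1 second interval, an observation of 0.95 would yield a 0.5 second interval under these assumed settings.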
In one aspect, adjusting workload allocations includes one or more of: scaling up work items such that additional cores are bound to the work items or scaling down the work items such that cores are unbound from the work items. In this scenario, one set of work items could be scaled up, while another set of work items could be scaled down. Adjusting workload allocations can include adjusting how many cores are assigned to a respective work item within the user plane. The adjusting can also occur at a scheduled number of observations which can be static or dynamic in terms of timing. While cores are referenced, the adjustment can also relate to any other type of compute resource such as memory, bandwidth, and so forth.
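Binding and unbinding of individual cores can be modeled with a small bookkeeping structure. The class `CoreBindings`, its method names, the work item names, and the rule that a work item always keeps at least one core are assumptions for illustration. The sketch tracks core identifiers only and does not pin any thread to a physical core.

```python
class CoreBindings:
    """Tracks which core IDs are bound to which work item."""

    def __init__(self, core_ids):
        self.free = set(core_ids)
        self.bound = {}  # work item name -> set of bound core IDs

    def scale_up(self, item, count=1):
        """Bind up to `count` free cores to `item`; return the cores granted."""
        granted = sorted(self.free)[:count]
        self.free.difference_update(granted)
        self.bound.setdefault(item, set()).update(granted)
        return granted

    def scale_down(self, item, count=1):
        """Unbind up to `count` cores from `item`, keeping at least one."""
        owned = self.bound.get(item, set())
        releasable_count = max(0, min(count, len(owned) - 1))
        released = sorted(owned, reverse=True)[:releasable_count]
        owned.difference_update(released)
        self.free.update(released)
        return released


bindings = CoreBindings(range(8))      # a set of 8 cores
bindings.scale_up("dpi", 2)            # cores 0 and 1
bindings.scale_up("tcp_opt", 6)        # cores 2 through 7
bindings.scale_down("tcp_opt", 1)      # one work item scales down...
bindings.scale_up("dpi", 1)            # ...while another scales up
```

The same bookkeeping could in principle be applied to other resource types, with core identifiers replaced by units of memory or bandwidth.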
The observation can further include an application type, wherein adjusting of the binding of cores to work items is performed based at least in part on application type. The adjusting of the binding of cores in this respect can involve assigning work items associated with a particular application type to a particular core or resource within the user plane. The configuration file can include one or more of a first threshold associated with CPU utilization, a second threshold associated with latency, a third threshold associated with packet drops and an application type. The threshold for any individual data type in the configuration file can be statically set or can be dynamic based on feedback information from the user plane and/or from outside the user plane. An administrator can also set the threshold for any data type.
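One possible shape for such a configuration file is sketched below. The JSON format, the key names, the units, and the threshold values are all hypothetical; the disclosure does not specify a file format. The helper `breached_thresholds` simply reports which key performance indicators exceed their configured thresholds.

```python
import json

# Hypothetical configuration contents; key names and values are assumptions.
EXAMPLE_CONFIG = """
{
  "cpu_utilization_max": 0.80,
  "latency_ms_max": 2.0,
  "packet_drops_max": 100,
  "application_type": "dpi"
}
"""


def load_config(text):
    """Parse the configuration text into a dictionary."""
    return json.loads(text)


def breached_thresholds(config, observation):
    """Return the names of the indicators that exceed their thresholds."""
    checks = [
        ("cpu_utilization", "cpu_utilization_max"),
        ("latency_ms", "latency_ms_max"),
        ("packet_drops", "packet_drops_max"),
    ]
    return [kpi for kpi, limit in checks if observation[kpi] > config[limit]]


config = load_config(EXAMPLE_CONFIG)
breaches = breached_thresholds(
    config,
    {"cpu_utilization": 0.92, "latency_ms": 1.1, "packet_drops": 250},
)
```

A dynamic threshold could be modeled by rewriting the corresponding dictionary entry from feedback before the next observation is evaluated.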
In the extended scenario, the observation upon which the adjusting is based includes not only data about the key indicators but also data about the application type or workload characteristics. Thus, the adjusting that is performed by the scheduler can take into account the indicators and/or the application type/workload characteristics and assign cores to work items accordingly. In some cases, core characteristics might match a certain application type or workload characteristic better than other application types or workload characteristics. In one aspect as well, an application or a workload can broadcast its affinities to the scheduler in advance and adjustments to the scheduler algorithm can occur to further refine and improve the assignment of cores to work items. Further, an application or workload could transmit characteristics of individual work items associated with the workload or application to the scheduler, such as data associated with work items that are network intensive, database intensive, or CPU intensive, for example. This data could then be used to adjust the assignment algorithm to further refine the assignment process.
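Matching core characteristics to broadcast affinities can be sketched as a simple ranking. The function `pick_cores`, the trait names such as `nic_local` and `large_cache`, and the scoring rule (count of shared traits, ties broken by lowest core ID) are hypothetical and serve only to illustrate the idea; the disclosure does not define a particular matching algorithm.

```python
def pick_cores(core_traits, affinities, count):
    """Choose `count` cores whose traits best match the given affinities.

    core_traits: core ID -> set of trait strings describing that core
    affinities:  set of trait strings broadcast by an application or workload
    Cores are ranked by number of matching traits, then by lowest core ID.
    """
    ranked = sorted(
        core_traits,
        key=lambda core: (-len(core_traits[core] & affinities), core),
    )
    return ranked[:count]


# Hypothetical core characteristics.
core_traits = {
    0: {"nic_local"},
    1: {"nic_local", "large_cache"},
    2: {"large_cache"},
    3: set(),
}
# A network-intensive workload announces its affinities in advance.
chosen = pick_cores(core_traits, {"nic_local", "large_cache"}, 2)
```

Here core 1 matches both announced affinities and is selected first, followed by core 0, which matches one.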
The first work item and the second work item each can include an individually schedulable function that operates on a queue of packets. The observation can further include an application type, wherein cores of the third number of cores and cores of the fourth number of cores are chosen based at least in part on the application type. The scheduler in the user plane can operate independently or at least in part in coordination with a north-bound orchestration system.
In some embodiments the computer-readable storage devices, mediums, and/or memories can include a cable or wireless signal containing a bit stream and the like. However, when mentioned, non-transitory computer-readable storage media expressly exclude media such as energy, carrier signals, electromagnetic waves, and signals per se.
Methods according to the above-described examples can be implemented using computer-executable instructions that are stored or otherwise available from computer readable media. Such instructions can include, for example, instructions and data which cause or otherwise configure a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Portions of computer resources used can be accessible over a network. The computer executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, firmware, or source code. Examples of computer-readable media that may be used to store instructions, information used, and/or information created during methods according to described examples include magnetic or optical disks, flash memory, USB devices provided with non-volatile memory, networked storage devices, and so on.
Devices implementing methods according to these disclosures can include hardware, firmware and/or software, and can take any of a variety of form factors. Typical examples of such form factors include laptops, smart phones, small form factor personal computers, personal digital assistants, rackmount devices, standalone devices, and so on. Functionality described herein also can be embodied in peripherals or add-in cards. Such functionality can also be implemented on a circuit board among different chips or different processes executing in a single device, by way of further example.
The instructions, media for conveying such instructions, computing resources for executing them, and other structures for supporting such computing resources are means for providing the functions described in these disclosures.
Although a variety of examples and other information was used to explain aspects within the scope of the appended claims, no limitation of the claims should be implied based on particular features or arrangements in such examples, as one of ordinary skill would be able to use these examples to derive a wide variety of implementations. Further and although some subject matter may have been described in language specific to examples of structural features and/or method steps, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to these described features or acts. For example, such functionality can be distributed differently or performed in components other than those identified herein. Rather, the described features and steps are disclosed as examples of components of systems and methods within the scope of the appended claims.
It should be understood that features or configurations herein with reference to one embodiment or example can be implemented in, or combined with, other embodiments or examples herein. That is, terms such as “embodiment”, “variation”, “aspect”, “example”, “configuration”, “implementation”, “case”, and any other terms which may connote an embodiment, as used herein to describe specific features or configurations, are not intended to limit any of the associated features or configurations to a specific or separate embodiment or embodiments, and should not be interpreted to suggest that such features or configurations cannot be combined with features or configurations described with reference to other embodiments, variations, aspects, examples, configurations, implementations, cases, and so forth. In other words, features described herein with reference to a specific example (e.g., embodiment, variation, aspect, configuration, implementation, case, etc.) can be combined with features described with reference to another example. Precisely, one of ordinary skill in the art will readily recognize that the various embodiments or examples described herein, and their associated features, can be combined with each other.
A phrase such as an “aspect” does not imply that such aspect is essential to the subject technology or that such aspect applies to all configurations of the subject technology. A disclosure relating to an aspect may apply to all configurations, or one or more configurations. A phrase such as an aspect may refer to one or more aspects and vice versa. A phrase such as a “configuration” does not imply that such configuration is essential to the subject technology or that such configuration applies to all configurations of the subject technology. A disclosure relating to a configuration may apply to all configurations, or one or more configurations. A phrase such as a configuration may refer to one or more configurations and vice versa. The word “exemplary” is used herein to mean “serving as an example or illustration.” Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs.
Moreover, claim language reciting “at least one of” a set indicates that one member of the set or multiple members of the set satisfy the claim. For example, claim language reciting “at least one of A, B, and C” or “at least one of A, B, or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.