Deployment of Software Defined Networking (SDN) and Network Function Virtualization (NFV) has also seen rapid growth in the past few years. Under SDN, the system that makes decisions about where traffic is sent (the control plane) is decoupled from the underlying system that forwards traffic to the selected destination (the data plane). SDN concepts may be employed to facilitate network virtualization, enabling service providers to manage various aspects of their network services via software applications and APIs (Application Program Interfaces). Under NFV, by virtualizing network functions as software applications, network service providers can gain flexibility in network configuration, enabling significant benefits including optimization of available bandwidth, cost savings, and faster time to market for new services.
NFV decouples software (SW) from the hardware (HW) platform. By virtualizing hardware functionality, it becomes possible to run various network functions on standard servers, rather than purpose-built HW platforms. Under NFV, software-based network functions run on top of a physical network input-output (IO) interface, such as a NIC (Network Interface Controller), using hardware functions that are virtualized using a virtualization layer (e.g., a Type-1 or Type-2 hypervisor or a container virtualization layer).
A goal of NFV is to be able to place multiple VNFs (Virtualized Network Functions) on a single platform and have them run side-by-side in an optimal way without disrupting each other; adding more traditional workloads that run next to those VNFs is another significant goal of the industry. However, these goals have been elusive in practice.
With an ever-growing number of VNFs that run on a variety of infrastructures (for example, VMware, KVM, OpenStack, Kubernetes, OpenShift), it becomes very difficult for integrators to understand the effects that running multiple VNFs and workloads may have on each other with regard to meeting service level agreements (SLAs), attesting to the security posture of the platform and workloads, and the like. One result of these difficulties is that the norm in the industry is to run a single VNF appliance on a single platform, which results in increased inter-platform communication, increased platform costs, and reduced resource utilization.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same becomes better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified:
Embodiments of methods and apparatus for workload feedback mechanisms facilitating a closed loop architecture are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
For clarity, individual components in the Figures herein may also be referred to by their labels in the Figures, rather than by a particular reference number. Additionally, reference numbers referring to a particular type of component (as opposed to a particular component) may be shown with a reference number followed by “(typ)” meaning “typical.” It will be understood that the configuration of these components will be typical of similar components that may exist but are not shown in the drawing Figures for simplicity and clarity or otherwise similar components that are not labeled with separate reference numbers. Conversely, “(typ)” is not to be construed as meaning the component, element, etc. is typically used for its disclosed function, implementation, purpose, etc.
An area of growing interest to cloud service providers, customers, and equipment vendors is the use of platform telemetry to help in analyzing the interactions of multiple workloads. Examples of this include the Intel® Performance Monitoring Unit (Intel® PMU) and Intel® Resource Director Technology (Intel® RDT) telemetry capabilities, which expose a great deal of telemetry on a per-core basis, including, but not limited to, how much of the various cache levels is being utilized by the core, cache misses, cache hits, memory bandwidth, and much more. Other processor vendors, such as AMD® and ARM®-based processor vendors, have likewise introduced telemetry capabilities.
Under aspects of the embodiments disclosed herein, the workloads themselves participate in publishing the metrics by which they are affected most to a host telemetry microservice. This microservice is specific to the VNF and carries out the correlation between the telemetry specific to the workload, platform PMU metrics, and the indicators. This indicator is then sent to an analytics system that analyzes it along with overall platform PMU metrics and makes appropriate recommendations to a management/orchestration entity (e.g., MANO), such as suggesting that the MANO spawn additional services or migrate them.
Recent activities have shown that CPU core frequencies can be scaled in order to achieve significant power savings for a specific DPDK (Dataplane Development Kit)-based workload; the standard operating system (OS)-based frequency managers do not work for DPDK applications because the core is always at 100% utilization by the nature of DPDK poll mode drivers (DPDK PMDs). Under embodiments, an implementation is able to detect the actual busyness of the DPDK PMD based upon PMU telemetry; in this instance the core frequency can be scaled based upon PMU telemetry data in order to save power, which is of significant importance to some VNF customers.
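The frequency-scaling idea above can be illustrated with a minimal sketch. The poll counters, busyness thresholds, and frequency tiers below are hypothetical stand-ins; a real implementation would derive busyness from PMU telemetry (e.g., via collectd or perf counters) and apply the chosen frequency through the platform's frequency-control interface.

```python
# Hedged sketch: estimate DPDK PMD "busyness" from poll statistics and
# map it to a core frequency tier. All counter names and tier values are
# illustrative assumptions, not part of any defined DPDK or PMU API.

def pmd_busyness(polls_with_packets: int, total_polls: int) -> float:
    """Fraction of poll iterations that actually received packets."""
    if total_polls == 0:
        return 0.0
    return polls_with_packets / total_polls

def pick_frequency_khz(busyness: float,
                       tiers=((0.25, 1_200_000),   # mostly idle polling
                              (0.50, 1_800_000),   # moderate traffic
                              (1.01, 2_400_000))   # heavily loaded
                       ) -> int:
    """Map busyness to a core frequency; tier values are hypothetical."""
    for ceiling, freq_khz in tiers:
        if busyness < ceiling:
            return freq_khz
    return tiers[-1][1]
```

An analytics component could run this periodically per core and request the frequency change through the management system, rather than writing to the platform directly.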
Today the platform telemetry collection mechanism most commonly used is collectd, and, accordingly, in one embodiment telemetry data collection mechanism 112 uses collectd. Collectd uses plugins for collecting a configurable number of metrics from server platforms and publishes the collected metrics to an analytics component, such as data analytics block 114. The analytics component uses the telemetry information in conjunction with the application telemetry (e.g., VNF telemetry 109) to potentially make changes to the platform (such as core frequency scaling or cache allocation) or to indicate to a scheduler to move a workload, for example.
To achieve the targeted level of automation, the workload/application/VNF participates in the telemetry exposure process. With as simple a telemetry indication as ‘Meeting SLA’ or ‘Not Meeting SLA’ (e.g., as represented by a ‘1’ or ‘0’), an analytics component will be able to analyze platform and OS telemetry to attempt to find the optimal conditions for the workload. If the telemetry provided by the workload can provide additional reasons as to why it may or may not be meeting its SLAs, then the analytics component may be able to do an even better job at narrowing down the corresponding platform telemetry.
Generally, the particular mechanisms by which telemetry and associated data are exposed and in what form the data are exposed are beyond the scope of this disclosure. One or more known mechanisms may be implemented, which may further employ secure network connections and/or out-of-band connections. Platform capabilities such as Hardware Queue Manager (HQM) may also be employed.
As shown in
Deployment architecture 300 generally operates as follows. During ongoing operations, platform telemetry data, such as PMU metrics, Intel® Resource Director Technology (RDT), reliability, availability, serviceability (RAS) data, libvirt data (for Linux platforms), etc., are collected from various telemetry sources by platform telemetry monitor 314 and published to analytics system 316. SLA monitor and analytics component 312 monitors the SLA metrics for VNF 310 and reports the VNF SLA violations to analytics system 316. Analytics system 316 performs data analytics to determine a correlation of VNF SLA violations and platform causes to determine a platform configuration adjustment recommendation, which is provided as an input to management system 318. Management system 318 then provides control inputs to server platform 302 to effect adjustment of the operational configuration of one or more hardware components, such as increasing core frequencies.
Generally, SLA monitor and local analytics component 312 can be implemented as software or hardware or a combination of both. In one embodiment, SLA monitor and local analytics component 312 comprises a host telemetry microservice. In one embodiment, SLA monitor and local analytics component 312 1) receives SLA analytics descriptor 320; 2) periodically monitors VNF metrics based on the rules provided by the descriptor; 3) forwards SLA violations when detected to analytics system 316; and 4) accepts changes to the analytics descriptor in the case of scaling events or other management requested changes.
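The four numbered responsibilities above can be sketched as a small monitoring loop. The descriptor fields, metric reader, and violation-reporting hook below are illustrative assumptions, not a defined interface.

```python
# Hedged sketch of an SLA monitor / local analytics loop: it holds a
# descriptor of rules, periodically evaluates a VNF metric against a
# threshold, forwards violations, and accepts descriptor updates.
import threading

class SLAMonitor:
    def __init__(self, descriptor, read_metric, report_violation):
        self.descriptor = descriptor          # 1) rules from the SLA analytics descriptor
        self.read_metric = read_metric        # callable: metric name -> current value
        self.report_violation = report_violation  # callable: (metric, value)
        self._lock = threading.Lock()

    def update_descriptor(self, descriptor):
        """4) Accept rule changes (e.g., on scaling events)."""
        with self._lock:
            self.descriptor = descriptor

    def poll_once(self) -> bool:
        """2) Evaluate the metric; 3) forward a violation if detected.
        Returns True when the SLA rule is satisfied."""
        with self._lock:
            rules = self.descriptor
        value = self.read_metric(rules["metric"])
        if value > rules["threshold"]:
            self.report_violation(rules["metric"], value)
            return False
        return True
```

In a deployment, `poll_once` would run on the period specified by the descriptor, with `report_violation` publishing to the analytics system.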
The VNF SLA violation indicator provides insight into whether the VNF is operating normally and meeting its SLA or failing to meet its SLA. Optionally, the VNF violation indicator may provide an indication of how well the SLA is being met, for example, 98% SLA compliance. As VNF 310 scales in/out or up/down, management system 318 can issue an SLA analytics descriptor configuration update to SLA monitor and local analytics component 312 or the host telemetry microservice, which will apply the new rules to determine SLA compliance.
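The two indicator forms just described, a binary met/not-met flag and a compliance percentage, can be sketched as follows. The latency-based SLA metric is a hypothetical example; any metric named in the descriptor could be substituted.

```python
# Hedged sketch of the SLA indicator forms: a '1'/'0' meeting/not-meeting
# flag, and a compliance percentage (e.g., 98% SLA compliance) computed
# over a window of samples. The metric (latency) is illustrative.

def sla_met(observed_ms: float, target_ms: float) -> int:
    """1 = meeting SLA, 0 = not meeting SLA."""
    return 1 if observed_ms <= target_ms else 0

def sla_compliance_pct(samples, target_ms: float) -> float:
    """Percentage of samples in the window that met the SLA."""
    if not samples:
        return 100.0
    met = sum(sla_met(s, target_ms) for s in samples)
    return 100.0 * met / len(samples)
```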
Deployment architecture 400 includes a Kubernetes node 402 implemented on hardware platform 404 and in which an operating system 406, a VNF 410 including an SLA monitor and local analytics component 412 and a platform telemetry monitor 414 are run or deployed. Platform telemetry monitor 414 provides (e.g., via publication) platform telemetry data such as PMU metrics, RDT, RAS, Kubelet, etc. to a data collection monitoring tool 415. Data collection monitoring tool 415 also receives information identifying VNF SLA violations from SLA monitor and analytics component 412. Data collection monitoring tool 415 makes these data available to an analytics system 416, which performs analytics analysis on these data and outputs a platform configuration adjustment recommendation that is sent to a controller 417 in a Kubernetes master 418.
As further shown in
In one embodiment, SLA Analytics Descriptor 420 1) is represented as a Kubernetes custom resource; 2) specifies the VNF metrics to monitor; and 3) defines thresholds/integration periods and combination rules for analysis and triggers that generate violations. In one embodiment, controller 417 1) is a Kubernetes custom controller that watches SLA Analytics Descriptors; 2) integrates with the Kubernetes control plane; 3) specifies the location(s) to report violations; 4) communicates SLA Monitor descriptors to the SLA Monitor and local analytics component on the pod; 5) updates SLA Monitor descriptors when required; and 6) performs logical resolution of rules from the SLA Analytics Descriptor to identify violations.
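A descriptor of the kind outlined above might be expressed as a Kubernetes custom resource along the following lines. The API group/version, kind, and field names are hypothetical assumptions for illustration, not a defined schema.

```yaml
# Hypothetical SLA Analytics Descriptor custom resource; all names and
# values here are illustrative assumptions.
apiVersion: example.com/v1alpha1
kind: SLAAnalyticsDescriptor
metadata:
  name: firewall-vnf-sla
spec:
  metrics:
    - name: packet_latency_ms        # VNF metric to monitor
      threshold: 10                  # violation trigger
      integrationPeriodSeconds: 30   # averaging window
  combinationRule: any               # violate if any metric breaches
  reportTo: data-collection-monitoring
```

The custom controller would watch resources of this kind and push the rules down to the SLA monitor sidecar on the pod.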
SLA Monitor and local analytics component 412 or a host telemetry container is deployed with the application in the pod. For example, it may be deployed as a sidecar container or native component. SLA Monitor and local analytics component 412 or a host telemetry container 1) receives the SLA Analytics descriptor from the controller; 2) periodically monitors VNF metrics based on the rules provided by the descriptor; 3) forwards violations to the Data Collection & Monitoring tool when detected; and 4) accepts changes to the analytics descriptor in the case of scaling events or other management requested changes.
As above, the VNF SLA violation indicator provides insight into whether the VNF is operating normally and meeting its SLA or failing to meet its SLA. Optionally, a VNF SLA violation may provide an indication of how well the SLA is being met, such as 98% SLA compliance.
Once the VNF is deployed and the VNF application SLA monitor is configured, the VNF is in service, as shown in a start block 510. In a block 512, SLA violations are detected based on the SLA monitor descriptor parameters. In a block 514, an SLA violation is reported to a management entity and/or analytics entity. As depicted by the loop back to start block 510, the operations of blocks 512 and 514 are performed in an ongoing manner while the VNF is in service.
Generally, the platform configuration adjustment recommendation may include information to enable the management system to address the VNF SLA violation by adjusting platform hardware, such as frequencies of processor cores.
Next, in a decision block 610, a determination is made as to whether a change in platform configuration can resolve the problem. If the answer is YES, the logic proceeds to a block 612 in which a relevant platform change to make is determined, followed by making the platform change in a block 614.
Returning to decision block 606, if the SLA is being met the logic proceeds to a decision block 616 to determine whether a platform change can be made while still maintaining the SLA performance criteria (e.g., performance metric(s)). For example, it may be desirable to reduce power consumption by lowering the frequency of one or more processor cores. If the answer to decision block 616 is YES, the logic proceeds to block 612, and the operations of blocks 612 and 614 are performed to determine and implement the platform change. If the answer to decision block 616 is NO, the logic loops back to block 604. Similarly, if it is determined in decision block 610 that a platform change cannot be made to resolve the problem (leading to the SLA violation), the logic returns to block 604.
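One iteration of the flow described above can be sketched as a single function. The callables standing in for the decision blocks (finding a corrective change, finding a power-saving change, applying a change) are hypothetical; block numbers in the comments refer to the flow described in the text.

```python
# Hedged sketch of one pass through the closed-loop decision flow:
# on an SLA violation, look for a corrective platform change; when the
# SLA is met, probe for a change (e.g., lowering core frequency) that
# saves power while still maintaining the SLA criteria.

def closed_loop_step(sla_met, find_fix, find_saving, apply_change):
    if not sla_met:                  # decision block 606: SLA violated
        change = find_fix()          # decision block 610 / block 612
        if change is not None:
            apply_change(change)     # block 614
            return "fixed"
        return "unresolved"          # loop back to block 604
    change = find_saving()           # decision block 616
    if change is not None:
        apply_change(change)         # blocks 612 and 614
        return "saved"
    return "no-op"                   # loop back to block 604
```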
Initially, firewall VNF 710 is deployed by MANO 718 using SLA analytics descriptor 720 in a similar manner to that described above. During ongoing operations, telemetry collector 708 collects platform telemetry data and provides (e.g., publishes) the collected telemetry data to analytics system 716. Firewall VNF 710 also provides performance indicia, such as a general indication of SLA performance, to analytics system 716. Analytics system 716 processes its inputs to produce a platform configuration adjustment recommendation that is provided to MANO 718. MANO 718 then provides configuration inputs 722 to adjust the configuration of applicable components on server platform 704.
Under an aspect of the method, workloads themselves participate in publishing the metrics by which they are affected most to the host telemetry microservice (e.g., host telemetry microservice 812). In one embodiment, the host telemetry microservice is specific to the VNF and carries out the correlation between the telemetry specific to the workload, platform PMU metrics, and the indicators, as depicted in a block 814. In one embodiment, a generic indication of performance is calculated or otherwise determined by host telemetry microservice 812, which forwards the generic indication to the analytics system (816) via the VNF (810). The analytics system analyzes the generic indication along with overall platform telemetry data (e.g., PMU metrics) from collectd 808 and makes appropriate recommendations to a management/orchestration entity (e.g., MANO), such as suggesting that the MANO spawn an additional service or migrate services.
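The correlation step in block 814 can be illustrated with a minimal sketch that finds which platform PMU metric tracks the workload's SLA metric most closely over a sample window. The metric names and sample data are illustrative assumptions.

```python
# Hedged sketch of correlating workload telemetry with platform PMU
# metrics: compute the Pearson correlation between the workload's SLA
# metric samples and each candidate PMU metric, and report the strongest.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def most_correlated_metric(workload_samples, pmu_series):
    """pmu_series: {metric_name: [samples]}. Returns (name, r) for the
    PMU metric with the strongest absolute correlation."""
    best_name, best_r = None, 0.0
    for name, samples in pmu_series.items():
        r = pearson(workload_samples, samples)
        if best_name is None or abs(r) > abs(best_r):
            best_name, best_r = name, r
    return best_name, best_r
```

The resulting (metric, correlation) pair is the kind of narrowing-down input the analytics system can use when forming a platform configuration adjustment recommendation.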
Consider a deployment of firewall VNF 810 that is mainly interested in an ‘SLA violation’ scenario. While deploying the firewall VNF, MANO 818 will deploy host telemetry microservice 812 based on the SLA analytics descriptor (not shown but similar to SLA analytics descriptor 720 in
Generally, the generic indication can be: 1 or 0 (good or bad); a number between 0 and 100, to indicate relative performance (e.g., 0% to 100%); or a message related to performance such as ‘Not meeting my SLA’/‘Meeting my SLA’. The generic indication can represent, for example: capacity, throughput, latency, etc. It could also be represented in XML, such as <generic performance indication name><Integer range>. Also, the proposed microservice here can be deployed for each new VNF. Optionally, the host telemetry microservice can be ‘generic’ and a service/VNF-specific plugin can be installed by the management system. As another option, the correlating of selected NFVI metrics with application metrics and the generic indication operation can also be done in the VNF itself, depending upon the performance sensitivity of the VNF.
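Since the generic indication may arrive in any of the forms listed above, a consuming analytics system might normalize them to one scale. The following sketch maps the example forms to a 0-100 score; the message strings follow the examples in the text, and the normalization choice itself is an assumption.

```python
# Hedged sketch: normalize the generic-indication variants (binary flag,
# 0-100 score, or SLA message) to a single 0-100 relative-performance
# scale for the analytics system.

def normalize_indication(value) -> int:
    if value == 'Meeting my SLA':
        return 100
    if value == 'Not meeting my SLA':
        return 0
    if value in (0, 1):          # binary good/bad flag
        return int(value) * 100
    return int(value)            # already a 0-100 relative score
```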
Processor 1006 further includes an Input/Output (I/O) interconnect hierarchy, which includes one or more levels of interconnect circuitry and interfaces that are collectively depicted as I/O interconnect & interfaces 1020 for simplicity. Various components and peripheral devices are coupled to processor 1006 via respective interfaces (not all separately shown), including a network interface 1022 and a firmware storage device 1024. In one embodiment, firmware storage device 1024 is connected to IO interconnect via a link 1025, such as an Enhanced Serial Peripheral Interface Bus (eSPI). As an option, firmware storage device 1024 may be operatively coupled to processor 1006 via a platform controller hub (PCH) 1027.
Network interface 1022 is connected to a network 1030, such as a local area network (LAN), private network, or similar network within a data center. For example, various types of data center architectures may be supported, including architectures employing server platforms interconnected by network switches such as Top-of-Rack (ToR) switches, as well as disaggregated architectures such as Intel® Corporation's Rack Scale Design architecture.
Platform hardware 1002 may also include a disk drive or solid-state disk (SSD) with controller 1032 in which software components 1034 are stored. Optionally, all or a portion of the software components used to implement the software aspects of embodiments herein may be loaded over a network 1030 accessed by network interface 1022.
The software components illustrated in
As further illustrated in
In one embodiment, PMON 1050 implements Memory Bandwidth Monitoring (MBM). MBM enables multiple VMs, VNFs, or applications to be tracked independently, which provides memory bandwidth monitoring for each running thread simultaneously. Benefits include detection of noisy neighbors, characterization and debugging of performance for bandwidth-sensitive applications, and more effective non-uniform memory access (NUMA)-aware scheduling.
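The per-workload bandwidth tracking that MBM enables amounts to differencing a byte counter over a sampling interval. On Linux, MBM counters are exposed through the resctrl filesystem (e.g., `mbm_total_bytes` under `/sys/fs/resctrl/mon_data/`); the helper below only shows the arithmetic, and the path and sampling scheme are noted as assumptions.

```python
# Hedged sketch: derive average memory bandwidth for a monitored group
# from two successive MBM byte-counter reads. The counter source (e.g.,
# resctrl's mbm_total_bytes on Linux) is an assumption; counter wraparound
# handling is omitted for brevity.

def mbm_bandwidth_mbps(bytes_before: int, bytes_after: int,
                       interval_s: float) -> float:
    """Average bandwidth over the interval, in MB/s."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return (bytes_after - bytes_before) / interval_s / 1e6
```

Comparing per-group bandwidth figures computed this way is one way to detect the "noisy neighbor" condition mentioned above.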
Although some embodiments have been described in reference to particular implementations, other implementations are possible according to some embodiments. Additionally, the arrangement and/or order of elements or other features illustrated in the drawings and/or described herein need not be arranged in the particular way illustrated and described. Many other arrangements are possible according to some embodiments.
In each system shown in a figure, the elements in some cases may each have a same reference number or a different reference number to suggest that the elements represented could be different and/or similar. However, an element may be flexible enough to have different implementations and work with some or all of the systems shown or described herein. The various elements shown in the figures may be the same or different. Which one is referred to as a first element and which is called a second element is arbitrary.
In the description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other. Additionally, “communicatively coupled” means that two or more elements that may or may not be in direct contact with each other, are enabled to communicate with each other. For example, if component A is connected to component B, which in turn is connected to component C, component A may be communicatively coupled to component C using component B as an intermediary component.
An embodiment is an implementation or example of the inventions. Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments, of the inventions. The various appearances “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments.
Not all components, features, structures, characteristics, etc. described and illustrated herein need be included in a particular embodiment or embodiments. If the specification states a component, feature, structure, or characteristic “may”, “might”, “can” or “could” be included, for example, that particular component, feature, structure, or characteristic is not required to be included. If the specification or claim refers to “a” or “an” element, that does not mean there is only one of the element. If the specification or claims refer to “an additional” element, that does not preclude there being more than one of the additional element.
Italicized letters, such as ‘n’ and ‘M’, etc. in the foregoing detailed description are used to depict an integer number, and the use of a particular letter is not limited to particular embodiments. Moreover, the same letter may be used in separate claims to represent separate integer numbers, or different letters may be used. In addition, use of a particular letter in the detailed description may or may not match the letter used in a claim that pertains to the same subject matter in the detailed description.
As discussed above, various aspects of the embodiments herein may be facilitated by corresponding software and/or firmware components and applications, such as software and/or firmware executed by a processor or the like. Thus, embodiments of this invention may be used as or to support a software program, software modules, firmware, and/or distributed software executed upon some form of processor, processing core or embedded logic, a virtual machine running on a processor or core, or otherwise implemented or realized upon or within a non-transitory computer-readable or machine-readable storage medium. A non-transitory computer-readable or machine-readable storage medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a non-transitory computer-readable or machine-readable storage medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a computer or computing machine (e.g., computing device, electronic system, etc.), such as recordable/non-recordable media (e.g., read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.). The content may be directly executable (“object” or “executable” form), source code, or difference code (“delta” or “patch” code). A non-transitory computer-readable or machine-readable storage medium may also include a storage or database from which content can be downloaded. The non-transitory computer-readable or machine-readable storage medium may also include a device or product having content stored thereon at a time of sale or delivery. Thus, delivering a device with stored content, or offering content for download over a communication medium, may be understood as providing an article of manufacture comprising a non-transitory computer-readable or machine-readable storage medium with such content described herein.
Various components referred to above as processes, servers, or tools described herein may be a means for performing the functions described. The operations and functions performed by various components described herein may be implemented by software running on a processing element, via embedded hardware or the like, or any combination of hardware and software. Such components may be implemented as software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, ASICs, DSPs, etc.), embedded controllers, hardwired circuitry, hardware logic, etc. Software content (e.g., data, instructions, configuration information, etc.) may be provided via an article of manufacture including non-transitory computer-readable or machine-readable storage medium, which provides content that represents instructions that can be executed. The content may result in a computer performing various functions/operations described herein.
As used herein, a list of items joined by the term “at least one of” can mean any combination of the listed terms. For example, the phrase “at least one of A, B or C” can mean A; B; C; A and B; A and C; B and C; or A, B and C.
The above description of illustrated embodiments of the invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.
These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification and the drawings. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.