This disclosure relates generally to network services and, more particularly, to methods, systems, articles of manufacture and apparatus for network service management.
In recent years, Edge networking environments have experienced increasing demands for services from different portions of the Edge networking environment, such as services from cloud service providers. Typically, services are decomposed into modular services referred to as “microservices.” Additionally, microservices may be containerized to operate in a more “self contained” manner when needed.
In general, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts. The figures are not to scale.
As used in this patent, stating that any part (e.g., a layer, film, area, region, or plate) is in any way on (e.g., positioned on, located on, disposed on, or formed on, etc.) another part, indicates that the referenced part is either in contact with the other part, or that the referenced part is above the other part with one or more intermediate part(s) located therebetween.
As used herein, connection references (e.g., attached, coupled, connected, and joined) may include intermediate members between the elements referenced by the connection reference and/or relative movement between those elements unless otherwise indicated. As such, connection references do not necessarily imply that two elements are directly connected and/or in fixed relation to each other. As used herein, stating that any part is in “contact” with another part is defined to mean that there is no intermediate part between the two parts.
As used herein, the phrase “in communication,” including variations thereof, encompasses direct communication and/or indirect communication through one or more intermediary components, and does not require direct physical (e.g., wired) communication and/or constant communication, but rather additionally includes selective communication at periodic intervals, scheduled intervals, aperiodic intervals, and/or one-time events.
As used herein, “processor circuitry” is defined to include (i) one or more special purpose electrical circuits structured to perform specific operation(s) and including one or more semiconductor-based logic devices (e.g., electrical hardware implemented by one or more transistors), and/or (ii) one or more general purpose semiconductor-based electrical circuits programmed with instructions to perform specific operations and including one or more semiconductor-based logic devices (e.g., electrical hardware implemented by one or more transistors). Examples of processor circuitry include programmed microprocessors, Field Programmable Gate Arrays (FPGAs) that may instantiate instructions, Central Processor Units (CPUs), Graphics Processor Units (GPUs), Digital Signal Processors (DSPs), XPUs, or microcontrollers and integrated circuits such as Application Specific Integrated Circuits (ASICs). For example, an XPU may be implemented by a heterogeneous computing system including multiple types of processor circuitry (e.g., one or more FPGAs, one or more CPUs, one or more GPUs, one or more DSPs, etc., and/or a combination thereof) and application programming interface(s) (API(s)) that may assign computing task(s) to whichever one(s) of the multiple types of the processing circuitry is/are best suited to execute the computing task(s).
Edge services are not always deterministic. While some tasks can be predicted in advance of their need (e.g., a time-of-day expectation for increased traffic), other tasks may become needed without any advance warning. In an effort to combat this uncertainty, typical microservice management systems in Edge networks require at least one instance of a particular microservice running in at least one Edge node of the Edge network(s). As used herein, a “microservice” is a decomposed (e.g., divided, split, allocated, etc.) service that would otherwise operate in a monolithic architecture. Unlike a monolithic architecture that bundles a large number of services into an application (app), and aggregates data storage into a single monolithic data source (e.g., a database) on which numerous apps and services rely, microservices are decomposed into smaller and/or otherwise more responsive services that, in some examples, have their own data storage and app framework. Microservices may include one or more functions that can be invoked and executed when the microservice is also instantiated and executing. Because microservices are relatively more nimble than their monolithic counterparts, individual microservices can be “spun-up” and instantiated faster than monolithic architectures. Microservices typically consume a smaller form factor or footprint, such as a smaller amount of consumed memory (e.g., an executable or bit stream stored in memory, disk storage, etc.). The relatively smaller form factor facilitates faster instantiation times and less bandwidth consumption. In some instances, examples disclosed herein apply to Edge networking services in the same manner as microservices. In other words, concepts disclosed herein to manage “microservices” may apply to concepts corresponding to “services.”
Of course, depending on a particular load at any particular time, some services (e.g., microservices) are not needed and/or are not performing any useful tasks, yet still consume a portion of node resources to remain ready for invocation. In some examples, microservices are instantiated so that they can either execute one or more desired tasks/functions (e.g., an active phase) or remain in hibernation (e.g., a dormant or hibernated phase in which the microservice binary is stored in disk storage, or stored in relatively faster cache memory in case it is expected to be needed sooner or lower latency is required for function execution). In some examples, services can also be hibernated. In some examples, particular services include a footprint that is too large for a relatively fast cache memory and, as such, are not candidates for hibernation. To instantiate a microservice, at least one fetch (e.g., a memory fetch for an executable/binary) is required so that the microservice is available in, for example, cache memory. Additionally, while a hibernated microservice can be further instantiated (e.g., brought into an operational/active phase) in a responsive manner because it is in cache memory, resources are still consumed to keep the microservice at some level of readiness. Stated differently, resources are wasted (e.g., in an effort to satisfy service level agreements (SLAs)).
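For purposes of illustration only, the phases described above may be modeled as a small state machine. The following is a minimal sketch under stated assumptions: the names `Phase`, `Microservice`, and `CACHE_CAPACITY_BYTES` are hypothetical and do not appear in the disclosure, the cache capacity value is arbitrary, and the "hibernated" phase is simplified here to mean that the binary is resident in cache memory.

```python
from enum import Enum


class Phase(Enum):
    STORED = "stored"          # binary resides only in disk storage
    HIBERNATED = "hibernated"  # binary fetched into cache memory, not executing
    ACTIVE = "active"          # instantiated and executing tasks/functions


# Hypothetical cache capacity used to decide hibernation eligibility.
CACHE_CAPACITY_BYTES = 64 * 1024 * 1024


class Microservice:
    def __init__(self, name: str, footprint_bytes: int) -> None:
        self.name = name
        self.footprint_bytes = footprint_bytes
        self.phase = Phase.STORED
        self.fetches = 0  # number of binary fetches performed so far

    def can_hibernate(self) -> bool:
        # A footprint larger than the cache is not a hibernation candidate.
        return self.footprint_bytes <= CACHE_CAPACITY_BYTES

    def hibernate(self) -> None:
        if not self.can_hibernate():
            raise ValueError(f"{self.name}: footprint too large for cache")
        if self.phase is Phase.STORED:
            self.fetches += 1  # one fetch brings the binary into cache
        self.phase = Phase.HIBERNATED

    def activate(self) -> None:
        if self.phase is Phase.STORED:
            self.fetches += 1  # a cold start pays the fetch at activation time
        self.phase = Phase.ACTIVE
```

In this sketch, activating a hibernated microservice requires no additional fetch, which reflects the responsiveness benefit described above, while the cache residency itself represents the resources consumed to maintain readiness.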
Such waste is compounded as services, networks and/or systems attempt to scale up. While employing microservices and/or containerized microservices is an improvement over monolithic architectures, the aggregated waste of scale up efforts still cannot be ignored. Examples disclosed herein manage microservice execution in a manner that improves responsivity and efficiency.
Compute (e.g., processor cycles), memory, and storage are scarce resources, and may generally decrease depending on the Edge location (e.g., fewer processing resources being available at consumer endpoint devices, than at a base station, than at a central office). However, the closer that the Edge location is to the endpoint (e.g., user equipment (UE)), the more that space and power are often constrained. Thus, Edge computing attempts to reduce the amount of resources needed for network services, through the distribution of more resources which are located closer both geographically and in network access time. In this manner, Edge computing attempts to bring the compute resources to workload data where appropriate, or bring the workload data to the compute resources. In some examples, a workload includes, but is not limited to, executable processes, such as algorithms, machine learning algorithms, image recognition algorithms, gain/loss algorithms, etc.
The following describes aspects of an Edge cloud architecture that covers multiple potential deployments and addresses restrictions that some network operators or service providers may have in their own infrastructures. These include variations of configurations based on the Edge location (because edges at a base station level, for instance, may have more constrained performance and capabilities in a multi-tenant scenario); configurations based on the type of compute, memory, storage, fabric, acceleration, or like resources available to Edge locations, tiers of locations, or groups of locations; the service, security, and management and orchestration capabilities; and related objectives to achieve usability and performance of end services. These deployments may accomplish processing in network layers that may be considered as “near Edge”, “close Edge”, “local Edge”, “middle Edge”, or “far Edge” layers, depending on latency, distance, and timing characteristics.
Edge computing is a developing paradigm where computing is performed at or closer to the “Edge” of a network, typically through the use of a compute platform (e.g., x86 or ARM compute hardware architecture) implemented at base stations, gateways, network routers, or other devices which are much closer to endpoint devices producing and consuming the data. For example, Edge gateway servers may be equipped with pools of memory and storage resources to perform computation in real-time for low latency use-cases (e.g., autonomous driving or video surveillance) for connected client devices. As another example, base stations may be augmented with compute and acceleration resources to directly process service workloads for connected user equipment, without further communicating data via backhaul networks. As a further example, central office network management hardware may be replaced with standardized compute hardware that performs virtualized network functions and offers compute resources for the execution of services and consumer functions for connected devices. Within Edge computing networks, there may be scenarios in services in which the compute resource will be “moved” to the data, as well as scenarios in which the data will be “moved” to the compute resource. As yet another example, base station compute, acceleration and network resources can provide services in order to scale to workload demands on an as needed basis by activating dormant capacity (subscription, capacity on demand) in order to manage corner cases, emergencies or to provide longevity for deployed resources over a significantly longer implemented lifecycle.
Examples of latency, resulting from network communication distance and processing time constraints, may range from less than a millisecond (ms) when among the endpoint layer A200, under 5 ms at the Edge devices layer A210, to even between 10 and 40 ms when communicating with nodes at the network access layer A220. Beyond the Edge cloud A110 are core network A230 and cloud data center A240 layers, each with increasing latency (e.g., between 50 and 60 ms at the core network layer A230, to 100 or more ms at the cloud data center layer). As a result, operations at a core network data center A235 or a cloud data center A245, with latencies of at least 50 to 100 ms or more, will not be able to accomplish many time-critical functions of the use cases A205. Each of these latency values is provided for purposes of illustration and contrast; it will be understood that the use of other access network mediums and technologies may further reduce the latencies. In some examples, respective portions of the network may be categorized as “close Edge”, “local Edge”, “near Edge”, “middle Edge”, or “far Edge” layers, relative to a network source and destination. For instance, from the perspective of the core network data center A235 or a cloud data center A245, a central office or content data network may be considered as being located within a “near Edge” layer (“near” to the cloud, having high latency values when communicating with the devices and endpoints of the use cases A205), whereas an access point, base station, on-premise server, or network gateway may be considered as located within a “far Edge” layer (“far” from the cloud, having low latency values when communicating with the devices and endpoints of the use cases A205). 
It will be understood that other categorizations of a particular network layer as constituting a “close”, “local”, “near”, “middle”, or “far” Edge may be based on latency, distance, number of network hops, or other measurable characteristics, as measured from a source in any of the network layers A200-A240.
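As an illustration of how the example latency figures above might inform placement of a time-critical function, the following sketch lists the layers whose representative latency fits within a required bound. The table `LAYER_LATENCY_MS` and the function `eligible_layers` are hypothetical names. The single value per layer is an assumption (the upper end of each quoted range, and the quoted lower bound of 100 ms for the cloud data center layer) chosen only for contrast.

```python
from typing import List, Tuple

# Representative latencies in milliseconds, assumed from the illustrative
# ranges quoted above. These are not measurements.
LAYER_LATENCY_MS: List[Tuple[str, float]] = [
    ("endpoint", 1.0),
    ("edge_devices", 5.0),
    ("network_access", 40.0),
    ("core_network", 60.0),
    ("cloud_data_center", 100.0),
]


def eligible_layers(max_latency_ms: float) -> List[str]:
    """Return layers whose representative latency fits the requirement."""
    return [
        name
        for name, latency_ms in LAYER_LATENCY_MS
        if latency_ms <= max_latency_ms
    ]
```

Under these assumed values, a function that tolerates at most 45 ms could be served from the endpoint, Edge devices, or network access layers, but not from the core network or cloud data center layers.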
The various use cases A205 may access resources under usage pressure from incoming streams, due to multiple services utilizing the Edge cloud. To achieve results with low latency, the services executed within the Edge cloud A110 balance varying requirements in terms of: (a) Priority (throughput or latency) and Quality of Service (QoS) (e.g., traffic for an autonomous car may have higher priority than a temperature sensor in terms of response time requirement; or, a performance sensitivity/bottleneck may exist at a compute/accelerator, memory, storage, or network resource, depending on the application); (b) Reliability and Resiliency (e.g., some input streams need to be acted upon and the traffic routed with mission-critical reliability, whereas some other input streams may tolerate an occasional failure, depending on the application); and (c) Physical constraints (e.g., power, cooling and form-factor).
The end-to-end service view for these use cases involves the concept of a service-flow and is associated with a transaction. The transaction details the overall service requirement for the entity consuming the service, as well as the associated services for the resources, workloads, workflows, and business functional and business level requirements. The services executed with the “terms” described may be managed at each layer in a way to assure real time, and runtime contractual compliance for the transaction during the lifecycle of the service. When a component in the transaction is missing its agreed-to service level agreement (SLA), the system as a whole (components in the transaction) may provide the ability to (1) understand the impact of the SLA violation, (2) augment other components in the system to resume overall transaction SLA, and (3) implement steps to remediate. In some examples, an SLA is an agreement, commitment and/or contract between entities. The SLA may include parameters (e.g., latency) and corresponding values (e.g., time in milliseconds) that must be satisfied for the SLA to be deemed in compliance.
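The notion of SLA parameters and corresponding values can be illustrated with a simple compliance check. The following is a minimal sketch only. `SlaTerm` and `violations` are hypothetical names, each term is assumed to be an upper bound on a measured value, and a missing measurement is assumed to count as a violation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SlaTerm:
    parameter: str  # e.g., "latency_ms"
    limit: float    # maximum permitted value (assumed upper bound)


def violations(terms: List[SlaTerm], measured: Dict[str, float]) -> List[str]:
    """Return the parameters whose measured value breaks the SLA."""
    broken = []
    for term in terms:
        value = measured.get(term.parameter)
        # Assumption: a parameter with no measurement cannot be shown
        # compliant, so it is reported as a violation.
        if value is None or value > term.limit:
            broken.append(term.parameter)
    return broken
```

An empty result indicates compliance for the transaction component. A non-empty result identifies which parameters would need remediation or augmentation by other components.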
Thus, with these variations and service features in mind, Edge computing within the Edge cloud A110 may provide the ability to serve and respond to multiple applications of the use cases A205 (e.g., object tracking, video surveillance, connected cars, etc.) in real-time or near real-time, and meet ultra-low latency requirements for these multiple applications. These advantages enable a whole new class of applications (Virtual Network Functions (VNFs), Function as a Service (FaaS), Edge as a Service (EaaS), standard processes, etc.), which cannot leverage conventional cloud computing due to latency or other limitations.
However, with the advantages of Edge computing come the following caveats. The devices located at the Edge are often resource constrained and therefore there is pressure on usage of Edge resources. Typically, this is addressed through the pooling of memory and storage resources for use by multiple users (tenants) and devices. The Edge may be power and cooling constrained and therefore the power usage needs to be accounted for by the applications that are consuming the most power. There may be inherent power-performance tradeoffs in these pooled memory resources, as many of them are likely to use emerging memory technologies, where more power requires greater memory bandwidth. Likewise, improved security of hardware and root of trust trusted functions are also required, because Edge locations may be unmanned and may even need permissioned access (e.g., when housed in a third-party location). Such issues are magnified in the Edge cloud A110 in a multi-tenant, multi-owner, or multi-access setting, where services and applications are requested by many users, especially as network usage dynamically fluctuates and the composition of the multiple stakeholders, use cases, and services changes.
At a more generic level, an Edge computing system may be described to encompass any number of deployments at the previously discussed layers operating in the Edge cloud A110 (network layers A200-A240), which provide coordination from client and distributed computing devices. One or more Edge gateway nodes, one or more Edge aggregation nodes, and one or more core data centers may be distributed across layers of the network to provide an implementation of the Edge computing system by or on behalf of a telecommunication service provider (“telco”, or “TSP”), internet-of-things service provider, cloud service provider (CSP), enterprise entity, or any other number of entities. Various implementations and configurations of the Edge computing system may be provided dynamically, such as when orchestrated to meet service objectives.
Consistent with the examples provided herein, a client compute node may be embodied as any type of endpoint component, device, appliance, or other thing capable of communicating as a producer or consumer of data. Further, the label “node” or “device” as used in the Edge computing system does not necessarily mean that such node or device operates in a client or agent/minion/follower role; rather, any of the nodes or devices in the Edge computing system refer to individual entities, nodes, or subsystems which include discrete or connected hardware or software configurations to facilitate or use the Edge cloud A110.
As such, the Edge cloud A110 is formed from network components and functional features operated by and within Edge gateway nodes, Edge aggregation nodes, or other Edge compute nodes among network layers A210-A230. The Edge cloud A110 thus may be embodied as any type of network that provides Edge computing and/or storage resources which are proximately located to radio access network (RAN) capable endpoint devices (e.g., mobile computing devices, IoT devices, smart devices, etc.), which are discussed herein. In other words, the Edge cloud A110 may be envisioned as an “Edge” which connects the endpoint devices and traditional network access points that serve as an ingress point into service provider core networks, including mobile carrier networks (e.g., Global System for Mobile Communications (GSM) networks, Long-Term Evolution (LTE) networks, 5G/6G networks, etc.), while also providing storage and/or compute capabilities. Other types and forms of network access (e.g., Wi-Fi, long-range wireless, wired networks including optical networks) may also be utilized in place of or in combination with such 3GPP carrier networks.
The network components of the Edge cloud A110 may be servers, multi-tenant servers, appliance computing devices, and/or any other type of computing devices. For example, the Edge cloud A110 may include an appliance computing device that is a self-contained electronic device including a housing, a chassis, a case or a shell. In some circumstances, the housing may be dimensioned for portability such that it can be carried by a human and/or shipped. Example housings may include materials that form one or more exterior surfaces that partially or fully protect contents of the appliance, in which protection may include weather protection, hazardous environment protection (e.g., EMI, vibration, extreme temperatures), and/or enable submergibility. Example housings may include power circuitry to provide power for stationary and/or portable implementations, such as AC power inputs, DC power inputs, AC/DC or DC/AC converter(s), power regulators, transformers, charging circuitry, batteries, wired inputs and/or wireless power inputs. Example housings and/or surfaces thereof may include or connect to mounting hardware to enable attachment to structures such as buildings, telecommunication structures (e.g., poles, antenna structures, etc.) and/or racks (e.g., server racks, blade mounts, etc.). Example housings and/or surfaces thereof may support one or more sensors (e.g., temperature sensors, vibration sensors, light sensors, acoustic sensors, capacitive sensors, proximity sensors, etc.). One or more such sensors may be contained in, carried by, or otherwise embedded in the surface and/or mounted to the surface of the appliance. Example housings and/or surfaces thereof may support mechanical connectivity, such as propulsion hardware (e.g., wheels, propellers, etc.) and/or articulating hardware (e.g., robot arms, pivotable appendages, etc.). 
In some circumstances, the sensors may include any type of input devices such as user interface hardware (e.g., buttons, switches, dials, sliders, etc.). In some circumstances, example housings include output devices contained in, carried by, embedded therein and/or attached thereto. Output devices may include displays, touchscreens, lights, LEDs, speakers, I/O ports (e.g., USB), etc. In some circumstances, Edge devices are devices presented in the network for a specific purpose (e.g., a traffic light), but may have processing and/or other capacities that may be utilized for other purposes. Such Edge devices may be independent from other networked devices and may be provided with a housing having a form factor suitable for its primary purpose, yet be available for other compute tasks that do not interfere with its primary task. Edge devices include Internet of Things devices. The appliance computing device may include hardware and software components to manage local issues such as device temperature, vibration, resource utilization, updates, power issues, physical and network security, etc. Example hardware for implementing an appliance computing device is described in conjunction with
In
In the example of
It should be understood that some of the devices in B110 are multi-tenant devices where Tenant 1 may function within a tenant1 ‘slice’ while a Tenant 2 may function within a tenant2 slice (and, in further examples, additional or sub-tenants may exist; and each tenant may even be specifically entitled and transactionally tied to a specific set of features all the way down to specific hardware features). A trusted multi-tenant device may further contain a tenant specific cryptographic key such that the combination of key and slice may be considered a “root of trust” (RoT) or tenant specific RoT. A RoT may further be dynamically composed using a DICE (Device Identity Composition Engine) architecture such that a single DICE hardware building block may be used to construct layered trusted computing base contexts for layering of device capabilities (such as a Field Programmable Gate Array (FPGA)). The RoT may further be used for a trusted computing context to enable a “fan-out” that is useful for supporting multi-tenancy. Within a multi-tenant environment, the respective Edge nodes B122, B124 may operate as security feature enforcement points for local resources allocated to multiple tenants per node. Additionally, tenant runtime and application execution (e.g., in instances B132, B134) may serve as an enforcement point for a security feature that creates a virtual Edge abstraction of resources spanning potentially multiple physical hosting platforms. Finally, the orchestration functions B160 at an orchestration entity may operate as a security feature enforcement point for marshalling resources along tenant boundaries.
Edge computing nodes may partition resources (memory, central processing unit (CPU), graphics processing unit (GPU), interrupt controller, input/output (I/O) controller, memory controller, bus controller, etc.) where respective partitionings may contain a RoT capability and where fan-out and layering according to a DICE model may further be applied to Edge Nodes. Cloud computing nodes often use containers, FaaS engines, servlets, servers, or other computation abstraction that may be partitioned according to a DICE layering and fan-out structure to support a RoT context for each. Accordingly, the respective RoTs spanning devices B110, B122, and B140 may coordinate the establishment of a distributed trusted computing base (DTCB) such that a tenant-specific virtual trusted secure channel linking all elements end to end can be established.
Further, it will be understood that a container may have data or workload specific keys protecting its content from a previous Edge node. As part of migration of a container, a pod controller at a source Edge node may obtain a migration key from a target Edge node pod controller where the migration key is used to wrap the container-specific keys. When the container/pod is migrated to the target Edge node, the unwrapping key is exposed to the pod controller that then decrypts the wrapped keys. The keys may now be used to perform operations on container specific data. The migration functions may be gated by properly attested Edge nodes and pod managers (as described above).
In further examples, an Edge computing system is extended to provide for orchestration of multiple applications through the use of containers (a contained, deployable unit of software that provides code and needed dependencies) in a multi-owner, multi-tenant environment. A multi-tenant orchestrator may be used to perform key management, trust anchor management, and other security functions related to the provisioning and lifecycle of the trusted ‘slice’ concept in
For instance, each Edge node B122, B124 may implement the use of containers, such as with the use of a container “pod” B126, B128 providing a group of one or more containers. In a setting that uses one or more container pods, a pod controller or orchestrator is responsible for local control and orchestration of the containers in the pod. Various Edge node resources (e.g., storage, compute, services, depicted with hexagons) provided for the respective Edge slices B132, B134 are partitioned according to the needs of each container.
With the use of container pods, a pod controller oversees the partitioning and allocation of containers and resources. The pod controller receives instructions from an orchestrator (e.g., orchestrator B160) that instructs the controller on how best to partition physical resources and for what duration, such as by receiving key performance indicator (KPI) targets based on SLA contracts. The pod controller determines which container requires which resources and for how long in order to complete the workload and satisfy the SLA. The pod controller also manages container lifecycle operations such as: creating the container, provisioning it with resources and applications, coordinating intermediate results between multiple containers working on a distributed application together, dismantling containers when workload completes, and the like. Additionally, the pod controller may serve a security role that prevents assignment of resources until the right tenant authenticates or prevents provisioning of data or a workload to a container until an attestation result is satisfied.
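The gating and lifecycle roles of the pod controller described above can be sketched as follows. This is an illustrative sketch under stated assumptions and not an actual orchestrator API: `PodController`, `allocate`, and `dismantle` are hypothetical names, and a single resource type (CPU millicores) stands in for the partitioned physical resources.

```python
from typing import Dict


class PodController:
    """Toy model of resource partitioning gated by security checks."""

    def __init__(self, cpu_millicores: int) -> None:
        self.free = cpu_millicores
        self.allocations: Dict[str, int] = {}

    def allocate(
        self,
        container: str,
        request: int,
        tenant_authenticated: bool,
        attestation_ok: bool,
    ) -> bool:
        # Security role: no resources are assigned until the tenant has
        # authenticated and the attestation result is satisfied.
        if not (tenant_authenticated and attestation_ok):
            return False
        if request > self.free:
            return False  # refuse rather than over-commit the node
        self.free -= request
        self.allocations[container] = request
        return True

    def dismantle(self, container: str) -> None:
        # Lifecycle role: return resources when the workload completes.
        self.free += self.allocations.pop(container, 0)
```

Refusing an allocation outright is only one possible design. An implementation could instead queue the request until resources are released or until KPI targets received from the orchestrator change.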
Also, with the use of container pods, tenant boundaries can still exist but in the context of each pod of containers. If each tenant specific pod has a tenant specific pod controller, there will be a shared pod controller that consolidates resource allocation requests to avoid typical resource starvation situations. Further controls may be provided to ensure attestation and trustworthiness of the pod and pod controller. For instance, the orchestrator B160 may provision an attestation verification policy to local pod controllers that perform attestation verification. If an attestation satisfies a policy for a first tenant pod controller but not a second tenant pod controller, then the second pod could be migrated to a different Edge node that does satisfy it. Alternatively, the first pod may be allowed to execute and a different shared pod controller is installed and invoked prior to the second pod executing.
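The migration decision described above can be illustrated with a small placement function. In this sketch, an attestation verification policy is assumed to be a set of required claims and the attestation of an Edge node is assumed to be a set of claims that the node can prove. The function `place_pod`, the node names, and the claim names are all hypothetical.

```python
from typing import Dict, FrozenSet, Optional


def place_pod(
    required_claims: FrozenSet[str],
    current_node: str,
    node_evidence: Dict[str, FrozenSet[str]],
) -> Optional[str]:
    """Return the node on which the pod may run, or None if no node qualifies."""
    # Keep the pod where it is when the current node satisfies the policy.
    if required_claims <= node_evidence.get(current_node, frozenset()):
        return current_node
    # Otherwise migrate to the first other node that satisfies the policy.
    for node, evidence in node_evidence.items():
        if node != current_node and required_claims <= evidence:
            return node
    return None


evidence = {
    "edge-1": frozenset({"secure_boot"}),
    "edge-2": frozenset({"secure_boot", "tee"}),
}
first_tenant_policy = frozenset({"secure_boot"})
second_tenant_policy = frozenset({"secure_boot", "tee"})
```

With these assumed values, the first tenant pod remains on `edge-1`, while the second tenant pod, whose policy `edge-1` does not satisfy, is placed on `edge-2`.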
The system arrangements depicted in
In the context of
In further examples, aspects of software-defined or controlled silicon hardware, and other configurable hardware, may integrate with the applications, functions, and services of an Edge computing system. Software defined silicon (SDSi) may be used to ensure the ability for some resource or hardware ingredient to fulfill a contract or service level agreement, based on the ingredient's ability to remediate a portion of itself or the workload (e.g., by an upgrade, reconfiguration, or provision of new features within the hardware configuration itself).
Furthermore, one or more IPUs can execute platform management, networking stack processing operations, security (crypto) operations, storage software, identity and key management, telemetry, logging, monitoring and service mesh (e.g., control how different microservices communicate with one another). The IPU can access an xPU to offload performance of various tasks. For instance, an IPU exposes XPU, storage, memory, and CPU resources and capabilities as a service that can be accessed by other microservices for function composition. This can improve performance and reduce data movement and latency. An IPU can perform capabilities such as those of a router, load balancer, firewall, TCP/reliable transport, a service mesh (e.g., proxy or API gateway), security, data-transformation, authentication, quality of service (QoS), telemetry measurement, event logging, initiating and managing data flows, data placement, or job scheduling of resources on an xPU, storage, memory, or CPU.
In the illustrated example of
In some examples, IPU D200 includes a field programmable gate array (FPGA) D270 structured to receive commands from a CPU, XPU, or application via an API and perform commands/tasks on behalf of the CPU, including workload management and offload or accelerator operations. The illustrated example of
Example compute fabric circuitry D250 provides connectivity to a local host or device (e.g., server or device (e.g., xPU, memory, or storage device)). Connectivity with a local host or device or smartNIC or another IPU is, in some examples, provided using one or more of peripheral component interconnect express (PCIe), ARM AXI, Intel® QuickPath Interconnect (QPI), Intel® Ultra Path Interconnect (UPI), Intel® On-Chip System Fabric (IOSF), Omnipath, Ethernet, Compute Express Link (CXL), HyperTransport, NVLink, Advanced Microcontroller Bus Architecture (AMBA) interconnect, OpenCAPI, Gen-Z, CCIX, Infinity Fabric (IF), and so forth. Different examples of the host connectivity provide symmetric memory and caching to enable equal peering between CPU, XPU, and IPU (e.g., via CXL.cache and CXL.mem).
Example media interfacing circuitry D260 provides connectivity to a remote smartNIC or another IPU or service via a network medium or fabric. This can be provided over any type of network media (e.g., wired or wireless) and using any protocol (e.g., Ethernet, InfiniBand, Fiber channel, ATM, to name a few).
In some examples, instead of the server/CPU being the primary component managing IPU D200, IPU D200 is a root of a system (e.g., rack of servers or data center) and manages compute resources (e.g., CPU, xPU, storage, memory, other IPUs, and so forth) in the IPU D200 and outside of the IPU D200. Different operations of an IPU are described below.
In some examples, the IPU D200 performs orchestration to decide which hardware or software is to execute a workload based on available resources (e.g., services and devices) and considers service level agreements and latencies, to determine whether resources (e.g., CPU, xPU, storage, memory, etc.) are to be allocated from the local host or from a remote host or pooled resource. In examples in which the IPU D200 is selected to perform a workload, secure resource managing circuitry D202 offloads work to a CPU, xPU, or other device and the IPU D200 accelerates connectivity of distributed runtimes, reduces latency and CPU load, and increases reliability.
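The orchestration decision described above can be sketched as a selection among local, remote, and pooled candidates. The following is a minimal sketch under stated assumptions. `Resource` and `select_resource` are hypothetical names, capacity is reduced to a single count of free units, and preferring the lowest-latency feasible candidate is an assumed tie-breaking rule rather than a rule stated in the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    name: str
    location: str      # "local", "remote", or "pooled"
    free_units: int    # available capacity in arbitrary units
    latency_ms: float  # expected access latency


def select_resource(
    candidates: List[Resource],
    units_needed: int,
    sla_latency_ms: float,
) -> Optional[Resource]:
    """Pick a resource with enough capacity that meets the SLA latency."""
    feasible = [
        r
        for r in candidates
        if r.free_units >= units_needed and r.latency_ms <= sla_latency_ms
    ]
    # Assumption: among feasible candidates, the lowest latency wins.
    return min(feasible, key=lambda r: r.latency_ms, default=None)
```

When the local host lacks capacity, the same call falls back to a remote host or pooled resource if one still satisfies the latency bound, and returns `None` when the SLA cannot be met by any candidate.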
In some examples, secure resource managing circuitry D202 runs a service mesh to decide what resource is to execute workload, and provide for L7 (application layer) and remote procedure call (RPC) traffic to bypass kernel altogether so that a user space application can communicate directly with the example IPU D200 (e.g., IPU D200 and application can share a memory space). In some examples, a service mesh is a configurable, low-latency infrastructure layer designed to handle communication among application microservices using application programming interfaces (APIs) (e.g., over remote procedure calls (RPCs)). The example service mesh provides fast, reliable, and secure communication among containerized or virtualized application infrastructure services. The service mesh can provide critical capabilities including, but not limited to service discovery, load balancing, encryption, observability, traceability, authentication and authorization, and support for the circuit breaker pattern.
In some examples, infrastructure services include a composite node created by an IPU at or after a workload from an application is received. In some cases, the composite node includes access to hardware devices, software using APIs, RPCs, gRPCs, or communications protocols with instructions such as, but not limited, to iSCSI, NVMe-oF, or CXL.
In some cases, the example IPU D200 dynamically selects itself to run a given workload (e.g., microservice) within a composable infrastructure including an IPU, xPU, CPU, storage, memory, and other devices in a node.
In some examples, communications transit through media interfacing circuitry D260 of the example IPU D200 through a NIC/smartNIC (for cross node communications) or are looped back to a local service on the same host. Communications through the example media interfacing circuitry D260 of the example IPU D200 to another IPU can then use shared memory support transport between xPUs switched through the local IPUs. Use of IPU-to-IPU communication can reduce latency and jitter through ingress scheduling of messages and work processing based on service level objective (SLO).
For example, for a request to a database application that requires a response, the example IPU D200 prioritizes its processing to minimize the stalling of the requesting application. In some examples, the IPU D200 schedules the prioritized message request, issuing the event to execute a SQL query against the database, and the example IPU constructs microservices that issue SQL queries, which are sent to the appropriate devices or services.
As described above, the non-deterministic nature of Edge networks requires techniques to ensure SLA expectations are met. For example, video analytics use case scenarios may require identification of safety issues in a city. Traditionally, applications would consume one or more video streams from corresponding cameras and/or image acquisition devices. Such applications may recompose the streams and perform object and/or person detection so that safety issues can be identified. However, the example use case in an Edge network becomes hardware limited when scale up efforts are attempted. In view of SLA expectations that microservices must be responsive, traditional microservice applications require resource allocation to each microservice and a corresponding instantiation effort. However, when scaling up one thousand or ten thousand fold, hardware limitations become a burdensome barrier when traditional microservices must be instantiated and running, despite no current demand for particular ones of those microservices. Examples disclosed herein improve responsivity and efficiency of managing microservice instantiation and execution.
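The SLO-based prioritization described above can be pictured as a priority queue keyed on a per-request deadline. The following is a minimal sketch under stated assumptions and is not the disclosed implementation: the `Request` fields, the `IngressScheduler` class, and the millisecond deadlines are hypothetical names and values chosen only for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Hypothetical fields: a smaller deadline stands for a tighter SLO.
    deadline_ms: float
    seq: int  # arrival order, used only to break ties
    name: str = field(compare=False)
    needs_response: bool = field(compare=False, default=False)


class IngressScheduler:
    """Orders incoming messages so the tightest SLO is served first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, name, deadline_ms, needs_response=False):
        # Push the request; the heap orders by (deadline_ms, seq).
        request = Request(deadline_ms, next(self._counter), name, needs_response)
        heapq.heappush(self._heap, request)

    def next_request(self):
        # Pop the most urgent request, or None when the queue is empty.
        return heapq.heappop(self._heap) if self._heap else None
```

In this sketch, a database query that blocks its requesting application would be submitted with a short deadline and therefore dequeued ahead of background traffic, while requests with equal deadlines keep their arrival order.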
The example smartNIC 104 includes microservice management circuitry, which is described in further detail below. Such microservice management circuitry includes example intercept logic 108, example hibernation/resume logic 110, example function information 112, and example monitoring logic 114. The example CPU 102 of
In operation, the example smartNIC 104 (which includes structure and/or logic corresponding to the example microservice management circuitry to be discussed in further detail below) invokes the example intercept logic 108 to intercept requests for functions to be executed by one or more microservices. In some examples, the intercept logic 108 utilizes the Compute Express Link (CXL) open standard to facilitate high-speed CPU-to-device (e.g., other CPUs, memory, etc.) communication, but examples disclosed herein are not limited thereto. In particular, the example intercept logic 108 monitors the example doorbell address space 118 for address/memory ranges identified by particular functions requested by an application local to the CPU 102 (e.g., functions requested by the example App 1 of
In some examples, the request for function invocation may not satisfy SLA requirements at the time of invocation. For instance, a prior function execution may have utilized the example CPU 102 to perform one or more tasks, after which the corresponding microservice was placed in a hibernation state to avoid wasting resources during instances of non-use (e.g., stopping the microservice, but storing its state information in cache memory for fast retrieval). However, at a second/later time when the same microservice is invoked, the CPU may be inundated with alternate/active tasks, thereby causing SLA performance to drop below threshold levels of acceptance. To address the event in which the example CPU 102 includes alternate hardware resources that are capable of performing the functions corresponding to the invoked microservices, the example function information circuitry 112 includes an example transformation table 160. The example transformation table 160 includes an example function identifier column 162, an example resource type column 164 and an example transformation functions column 166. When faced with the threat of not satisfying SLA requirements, the example function information circuitry 112 provides any number of transformation functions to transform a previously targeted first hardware device to employ a different/second hardware device. In the illustrated example of
In some examples, the hibernation/resume logic 110 detects when a request is identified (e.g., via particular memory ranges identified in the request) and, if the desired function is not up, determines where to instantiate the function. As described above, the monitoring logic 114 may monitor the status of any number of available devices (e.g., the CPU 102, DEV 0 and/or DEV N (128)) to aid in the determination of where the function should be instantiated. In some examples, when other microservices are running, yet there has been no demand for one or more functions enabled by the microservices, the example hibernation/resume logic 110 causes one or more microservices to enter a hibernation state, thereby freeing additional computing resources for the node. In some examples, deciding which microservices to hibernate is based on a priority or importance metric of the request (e.g., a priority metric in App 1) or associated SLA information corresponding to a function of interest.
The example monitoring logic 114 also monitors and/or otherwise tracks how often functions are executed, as well as monitoring memory regions where requests are queued. Hibernation decisions can be made to store hibernated functions in persistent memory (for later fast retrieval) or disk storage (e.g., when imminent re-use is not anticipated). Monitored parameters that illustrate a manner of determining which functions (and/or their corresponding microservices) are to be hibernated include, but are not limited to, a particular frequency of use, a particular latency metric, or a particular importance (e.g., priority) of the function.
In operation, the example microservice translation circuitry 202 queries the example doorbell address space 118 to obtain information regarding microservice status. As described above, examples disclosed herein may apply to services in some cases, thus the example microservice translation circuitry 202 is not limited to microservices. Microservice status includes information indicative of whether the microservice was ever instantiated in the past, whether the microservice was previously instantiated, but currently hibernated, and whether the microservice is currently running. As described above, each microservice of interest includes a particular memory address range in which data can be toggled and/or otherwise configured to identify a current state of the microservice on the node of interest (e.g., on the CPU 102). Based on the information retrieved from the query, the example microservice translation circuitry 202 updates the microservice state information stored in the example doorbell address space 118. In other words, the query results in obtaining state information for any number of microservices corresponding to the doorbell address space 118. For example, and briefly returning to the illustrated example of
However, also consider that the example microservice management system 100 of
After the example microservice translation circuitry 202 updates the microservice state information, the example microservice hibernation circuitry 204 updates microservice hibernation decisions. As described above, examples disclosed herein may apply to services in some cases, thus the example microservice hibernation circuitry 204 is not limited to microservices. For instance, in the event a particular microservice is not particularly active (e.g., the microservice satisfies one or more criteria indicative of inactivity, such as a frequency of invocation per unit of time), the example microservice hibernation circuitry 204 causes one or more microservices to hibernate. In some examples, the manner of hibernation differs based on different parameters, such as a priority value associated with the microservice. For instance, for microservices that are deemed very critical (e.g., based on SLA parameters) and/or invoked at a relatively high frequency, hibernation can be facilitated using persistent memory so that re-instantiation is fast. On the other hand, for microservices that are deemed less critical and/or invoked at a relatively low frequency, hibernation can be facilitated using disk storage, thereby conserving valuable high-speed memory of the target computing device.
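The hibernation decision described above can be sketched as a two-threshold rule. This is only an illustrative sketch: the source does not specify threshold values or data structures, so `IDLE_THRESHOLD`, `WARM_THRESHOLD`, `MicroserviceStats`, and `hibernation_tier` are hypothetical names and numbers introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    PERSISTENT_MEMORY = "persistent_memory"  # fast re-instantiation
    DISK = "disk"                            # conserves high-speed memory


@dataclass
class MicroserviceStats:
    name: str
    invocations_per_min: float
    critical: bool  # e.g., derived from SLA parameters


# Hypothetical thresholds; the source does not give concrete values.
IDLE_THRESHOLD = 1.0  # below this rate the microservice counts as inactive
WARM_THRESHOLD = 0.2  # inactive, but still invoked relatively often


def hibernation_tier(stats: MicroserviceStats) -> Optional[Tier]:
    """Return where to hibernate the microservice, or None to keep it running."""
    if stats.invocations_per_min >= IDLE_THRESHOLD:
        return None  # still active, do not hibernate
    if stats.critical or stats.invocations_per_min >= WARM_THRESHOLD:
        return Tier.PERSISTENT_MEMORY
    return Tier.DISK
```

The ordering of the checks reflects the text: inactivity is decided first, and only then do criticality and invocation frequency select between persistent memory and disk storage.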
While the above examples illustrate establishing a state of operation for any given node, the example microservice request circuitry 206 monitors for any incoming microservice request. As described above, examples disclosed herein may apply to services in some cases, thus the example microservice request circuitry 206 is not limited to microservices. Incoming microservice requests may occur due to external requests (e.g., Edge node network sourced requests, such as example APP 2), or requests that occur within the Edge node (e.g., APP 1). If no requests are detected based on detecting a difference in one or more address ranges of the example doorbell address space 118, monitoring and updating continues. However, in response to the example microservice request circuitry 206 detecting a microservice request, the example microservice hibernation circuitry 204 determines whether the requested microservice is already instantiated. If not, the example microservice instantiation circuitry 212 instantiates the requested microservice, and the microservice translation circuitry 202 updates the state information in the example doorbell address space 118 to reflect that the microservice is now instantiated. As such, any subsequent request for this microservice will be met with a prompt status, thereby allowing the requestor to find an alternate node and/or instantiate the microservice with alternate computing resources.
However, if the microservice is already instantiated, as determined by the example microservice hibernation circuitry 204, the example resource discovery circuitry 208 identifies resources that implement the requested microservice. In some examples, the resource discovery circuitry 208 examines the node (e.g., the CPU 102 and any communicatively connected computing devices) for computing devices. Results of such examination may identify that the particular node includes a CPU, a particular number of cores of the CPU, a particular availability of the number of cores of the CPU, a GPU, an FPGA, an accelerator, and/or availability of all such computing devices. The example SLA analysis circuitry 210 determines whether the SLA is expected to be satisfied when using the computing device associated with the microservice request. If so, the example microservice instantiation circuitry 212 spins-up and/or otherwise causes the microservice to operate (e.g., in some examples, the microservice was previously instantiated, but in a hibernated state). The example microservice translation circuitry 202 then updates the doorbell state information.
However, in the event the SLA is not expected to be satisfied when implementing the compute resources associated with the microservice request, as determined by the example SLA analysis circuitry 210, the example resource discovery circuitry 208 identifies one or more available compute resources. The example microservice translation circuitry 202 obtains translation binaries from the example transformation table 160 and translates the microservice so that the alternate compute device(s) can be used. Such changes to the microservice instantiation activity are again written by the microservice translation circuitry 202 to the doorbell address space 118.
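One way to picture the transformation table 160 is as a mapping from a (function identifier, resource type) pair to a transformation function. The sketch below assumes that shape; the function names, the `object_detect` identifier, and the byte-prefix "translation" are placeholders and do not represent the actual translation binaries.

```python
from typing import Callable, Dict, List, Optional, Tuple

TransformFn = Callable[[bytes], bytes]


def _to_gpu(binary: bytes) -> bytes:
    # Placeholder: a real transformation would emit a GPU-targeted binary.
    return b"GPU:" + binary


def _to_fpga(binary: bytes) -> bytes:
    # Placeholder: a real transformation would emit an FPGA-targeted binary.
    return b"FPGA:" + binary


# Keys mirror the function identifier and resource type columns;
# values mirror the transformation functions column.
TRANSFORMATION_TABLE: Dict[Tuple[str, str], TransformFn] = {
    ("object_detect", "GPU"): _to_gpu,
    ("object_detect", "FPGA"): _to_fpga,
}


def translate(
    function_id: str, binary: bytes, available_resources: List[str]
) -> Optional[Tuple[str, bytes]]:
    """Translate for the first available resource that has a table entry."""
    for resource in available_resources:
        transform = TRANSFORMATION_TABLE.get((function_id, resource))
        if transform is not None:
            return resource, transform(binary)
    return None  # no alternate resource can host this function
```

Returning `None` when no entry matches is a design choice of this sketch; the source does not state what happens when no alternate compute resource is available.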
While examples disclosed above consider microservice responsivity and efficiency improvements managed by the example microservice management circuitry 200 operating in a smartNIC or an IPU, separate microservice translation circuitry 130, 202 may reside and/or otherwise operate on the example CPU 102 or other compute node within an Edge node device of an Edge network. As described above, microservice requests (and function requests) disclosed herein require a common knowledge of which functions correspond to which memory ranges in the example doorbell address space 118. As such, when the example smartNIC and/or IPU is “out of the loop” when the compute device internally instantiates a microservice (e.g., when APP 1 instantiates a microservice as a local operation), the example microservice translation circuitry 130 located in the corresponding compute device facilitates a proper lookup of the relevant memory address ranges that must be updated. As such, during the periodic, aperiodic and/or otherwise scheduled querying of the doorbell address space 118 by the smartNIC or IPU, such microservice operating changes are promptly discovered.
As described above,
In some examples, the microservice translation circuitry 202 includes means for translating microservices, the microservice hibernation circuitry 204 includes means for hibernating services, the microservice request circuitry 206 includes means for requesting microservices, the resource discovery circuitry 208 includes means for discovering resources, the SLA analysis circuitry 210 includes means for analyzing SLA, the microservice instantiation circuitry 212 includes means for instantiating microservices, and the microservice management circuitry 200 includes means for managing microservices. For example, the means for translating microservices may be implemented by the example microservice translation circuitry 202, the means for hibernating services may be implemented by the example microservice hibernation circuitry 204, the means for requesting microservices may be implemented by the example microservice request circuitry 206, the means for discovering resources may be implemented by the example resource discovery circuitry 208, the means for analyzing SLA may be implemented by the example SLA analysis circuitry 210, the example means for instantiating microservices may be implemented by the example microservice instantiation circuitry 212, and the means for managing microservices may be implemented by the example microservice management circuitry 200. In some examples, the aforementioned circuitry may be instantiated by processor circuitry such as the example processor circuitry 512 of
While an example manner of implementing the microservice management circuitry 200 of
Flowcharts representative of example hardware logic circuitry, machine readable instructions, hardware implemented state machines, and/or any combination thereof for implementing the microservice management circuitry 200 of
The machine readable instructions described herein may be stored in one or more of a compressed format, an encrypted format, a fragmented format, a compiled format, an executable format, a packaged format, etc. Machine readable instructions as described herein may be stored as data or a data structure (e.g., as portions of instructions, code, representations of code, etc.) that may be utilized to create, manufacture, and/or produce machine executable instructions. For example, the machine readable instructions may be fragmented and stored on one or more storage devices and/or computing devices (e.g., servers) located at the same or different locations of a network or collection of networks (e.g., in the cloud, in edge devices, etc.). The machine readable instructions may require one or more of installation, modification, adaptation, updating, combining, supplementing, configuring, decryption, decompression, unpacking, distribution, reassignment, compilation, etc., in order to make them directly readable, interpretable, and/or executable by a computing device and/or other machine. For example, the machine readable instructions may be stored in multiple parts, which are individually compressed, encrypted, and/or stored on separate computing devices, wherein the parts when decrypted, decompressed, and/or combined form a set of machine executable instructions that implement one or more operations that may together form a program such as that described herein.
In another example, the machine readable instructions may be stored in a state in which they may be read by processor circuitry, but require addition of a library (e.g., a dynamic link library (DLL)), a software development kit (SDK), an application programming interface (API), etc., in order to execute the machine readable instructions on a particular computing device or other device. In another example, the machine readable instructions may need to be configured (e.g., settings stored, data input, network addresses recorded, etc.) before the machine readable instructions and/or the corresponding program(s) can be executed in whole or in part. Thus, machine readable media, as used herein, may include machine readable instructions and/or program(s) regardless of the particular format or state of the machine readable instructions and/or program(s) when stored or otherwise at rest or in transit.
The machine readable instructions described herein can be represented by any past, present, or future instruction language, scripting language, programming language, etc. For example, the machine readable instructions may be represented using any of the following languages: C, C++, Java, C#, Perl, Python, JavaScript, HyperText Markup Language (HTML), Structured Query Language (SQL), Swift, etc.
As mentioned above, the example operations of
“Including” and “comprising” (and all forms and tenses thereof) are used herein to be open ended terms. Thus, whenever a claim employs any form of “include” or “comprise” (e.g., comprises, includes, comprising, including, having, etc.) as a preamble or within a claim recitation of any kind, it is to be understood that additional elements, terms, etc., may be present without falling outside the scope of the corresponding claim or recitation. As used herein, when the phrase “at least” is used as the transition term in, for example, a preamble of a claim, it is open-ended in the same manner as the terms “comprising” and “including” are open ended. The term “and/or” when used, for example, in a form such as A, B, and/or C refers to any combination or subset of A, B, C such as (1) A alone, (2) B alone, (3) C alone, (4) A with B, (5) A with C, (6) B with C, or (7) A with B and with C. As used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, or (3) at least one A and at least one B. Similarly, as used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, or (3) at least one A and at least one B. As used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, or (3) at least one A and at least one B. Similarly, as used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, or (3) at least one A and at least one B.
As used herein, singular references (e.g., “a”, “an”, “first”, “second”, etc.) do not exclude a plurality. The term “a” or “an” object, as used herein, refers to one or more of that object. The terms “a” (or “an”), “one or more”, and “at least one” are used interchangeably herein. Furthermore, although individually listed, a plurality of means, elements or method actions may be implemented by, e.g., the same entity or object. Additionally, although individual features may be included in different examples or claims, these may possibly be combined, and the inclusion in different examples or claims does not imply that a combination of features is not feasible and/or advantageous.
The example microservice translation circuitry 202 updates the microservice state information based on the query information retrieved from the example doorbell address space 118 (block 304). As described above, the manner of determining the state of the microservice is based on analyzing the information in particular memory address ranges that are dedicated to each particular microservice. By requiring requesting software stacks to identify microservices based on specific memory address ranges, subsequent determination efforts of microservice states can operate in a much faster manner when compared to initiating communication protocols within the Edge node. Instead, the example microservice translation circuitry 202 can quickly query bit state information of the specific microservice to determine if it is instantiated (e.g., bit state of one) or dormant/inactive (e.g., bit state of zero). Examples disclosed herein are not limited to binary state information of the specific memory address ranges, as the state information may also identify instances of microservices that are instantiated, but in a hibernated state. Still further, in some examples the state information determined by the example microservice translation circuitry 202 reveals information corresponding to memory locations where microservice executables are located (e.g., in persistent memory (e.g., fast retrieval), in disk storage (e.g., relatively slower retrieval), etc.).
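The per-microservice state lookup described above can be illustrated with a small in-memory model of the doorbell address space. The layout is an assumption made for this sketch only: one byte per microservice at a fixed offset, with two bits for the state and two bits for the storage location. The names `DOORBELL_OFFSETS`, `write_state`, and `read_state` are hypothetical.

```python
from enum import IntEnum


class State(IntEnum):
    DORMANT = 0       # bit state of zero in the binary example
    INSTANTIATED = 1  # bit state of one in the binary example
    HIBERNATED = 2    # non-binary extension: instantiated but hibernated


class Location(IntEnum):
    NONE = 0
    PERSISTENT_MEMORY = 1  # fast retrieval
    DISK = 2               # relatively slower retrieval


# Hypothetical layout: one byte per microservice at a fixed offset.
# Bits 0-1 hold the State, bits 2-3 hold the Location.
DOORBELL_OFFSETS = {"object_detect": 0x00, "recompose": 0x01}


def write_state(space: bytearray, name: str, state: State,
                loc: Location = Location.NONE) -> None:
    # Pack location and state into the microservice's dedicated byte.
    space[DOORBELL_OFFSETS[name]] = (loc << 2) | state


def read_state(space: bytearray, name: str):
    # A single indexed read replaces any protocol exchange with the node.
    raw = space[DOORBELL_OFFSETS[name]]
    return State(raw & 0b11), Location((raw >> 2) & 0b11)
```

Because each microservice maps to a known offset, determining its state is a single memory read followed by bit masking, which is the speed advantage the paragraph above attributes to address-range-based identification.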
The example microservice hibernation circuitry 204 updates microservice hibernation states based on metrics indicative of stale and/or otherwise underutilized microservices (block 306). Example metrics that, when satisfied in view of one or more thresholds, cause microservice hibernation include a particular frequency of microservice request, a particular priority value (e.g., criticality) associated with the microservice, etc. The example microservice request circuitry 206 determines whether a microservice request has occurred (block 308). If not, control returns to block 302 to continue to monitor the state of microservice behaviors on the Edge device (e.g., on the example CPU 102 of
However, if the example microservice hibernation circuitry 204 determines that the microservice has already been instantiated on the Edge node (block 310), then the example resource discovery circuitry 208 identifies which resources of the Edge node are to implement the requested microservice (block 316). The example SLA analysis circuitry 210 determines whether the SLA requirements will be satisfied in view of the identified resource that is expected and/or otherwise assigned to execute the requested microservice (block 318). As described above, one or more resources of an Edge node may experience utilization demands that are non-deterministic. As such, if the computing resource is inundated with other tasks from other tenants, then SLA parameters may not be satisfied if the microservice is assigned to that inundated compute resource. In such a circumstance, the example resource discovery circuitry 208 identifies available resources of the Edge node (block 320), and the microservice translation circuitry 202 translates the microservice binaries for an alternate computing resource that can accommodate the microservice demands while also maintaining SLA performance expectations (block 322). The example microservice instantiation circuitry 212 then instantiates and/or otherwise spins-up the microservice so that its one or more functions and/or tasks can begin (block 324). Thereafter, the example microservice translation circuitry 202 updates the doorbell state information (block 314). In circumstances where the example SLA analysis circuitry 210 determines that SLA parameters will be satisfied (block 318), control advances to block 324, where the example microservice instantiation circuitry 212 instantiates and/or otherwise spins-up the microservice.
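The request-handling branch described above (blocks 310 through 324, ending at block 314) can be traced with a short sketch. It is a simplified model under stated assumptions: the doorbell is a plain dictionary, `sla_ok` is a caller-supplied predicate standing in for the SLA analysis, and raising an error when no resource qualifies is a choice made here, since the source does not describe that case. All function and parameter names are hypothetical.

```python
from typing import Callable, Dict, List


def handle_request(
    name: str,
    doorbell: Dict[str, str],
    assigned: Dict[str, str],
    sla_ok: Callable[[str], bool],
    available: List[str],
) -> List[str]:
    """Return the ordered steps taken for one microservice request."""
    steps: List[str] = []
    if name not in doorbell:
        # Block 310, "not instantiated" branch: instantiate the microservice.
        steps.append("instantiate")
    else:
        resource = assigned[name]              # block 316: assigned resource
        steps.append(f"identify:{resource}")
        if not sla_ok(resource):               # block 318: SLA check fails
            # Block 320: find an available resource that can meet the SLA.
            resource = next((r for r in available if sla_ok(r)), None)
            if resource is None:
                raise RuntimeError("no available resource satisfies the SLA")
            steps.append(f"translate:{resource}")  # block 322
            assigned[name] = resource
        steps.append(f"spin_up:{resource}")    # block 324
    doorbell[name] = "running"                 # block 314: update doorbell
    steps.append("update_doorbell")
    return steps
```

Both branches end by updating the doorbell state, which mirrors how every path in the description converges on block 314 so that later queries of the doorbell address space see the current state.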
In the illustrated example of
The processor platform 500 of the illustrated example includes processor circuitry 512. The processor circuitry 512 of the illustrated example is hardware. For example, the processor circuitry 512 can be implemented by one or more integrated circuits, logic circuits, FPGAs, microprocessors, CPUs, GPUs, DSPs, and/or microcontrollers from any desired family or manufacturer. The processor circuitry 512 may be implemented by one or more semiconductor based (e.g., silicon based) devices. In this example, the processor circuitry 512 implements the example microservice translation circuitry 202, the example microservice hibernation circuitry 204, the example microservice request circuitry 206, the example resource discovery circuitry 208, the example SLA analysis circuitry 210, the example microservice instantiation circuitry 212, and/or, more generally, the example microservice management circuitry 200 of
The processor circuitry 512 of the illustrated example includes a local memory 513 (e.g., a cache, registers, etc.). The processor circuitry 512 of the illustrated example is in communication with a main memory including a volatile memory 514 and a non-volatile memory 516 by a bus 518. The volatile memory 514 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS® Dynamic Random Access Memory (RDRAM®), and/or any other type of RAM device. The non-volatile memory 516 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 514, 516 of the illustrated example is controlled by a memory controller 517.
The processor platform 500 of the illustrated example also includes interface circuitry 520. The interface circuitry 520 may be implemented by hardware in accordance with any type of interface standard, such as an Ethernet interface, a universal serial bus (USB) interface, a Bluetooth® interface, a near field communication (NFC) interface, a Peripheral Component Interconnect (PCI) interface, and/or a Peripheral Component Interconnect Express (PCIe) interface.
In the illustrated example, one or more input devices 522 are connected to the interface circuitry 520. The input device(s) 522 permit(s) a user to enter data and/or commands into the processor circuitry 512. The input device(s) 522 can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, an isopoint device, and/or a voice recognition system.
One or more output devices 524 are also connected to the interface circuitry 520 of the illustrated example. The output device(s) 524 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display (LCD), a cathode ray tube (CRT) display, an in-place switching (IPS) display, a touchscreen, etc.), a tactile output device, a printer, and/or speaker. The interface circuitry 520 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip, and/or graphics processor circuitry such as a GPU.
The interface circuitry 520 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem, a residential gateway, a wireless access point, and/or a network interface to facilitate exchange of data with external machines (e.g., computing devices of any kind) by a network 526. The communication can be by, for example, an Ethernet connection, a digital subscriber line (DSL) connection, a telephone line connection, a coaxial cable system, a satellite system, a line-of-sight wireless system, a cellular telephone system, an optical connection, etc.
The processor platform 500 of the illustrated example also includes one or more mass storage devices 528 to store software and/or data. Examples of such mass storage devices 528 include magnetic storage devices, optical storage devices, floppy disk drives, HDDs, CDs, Blu-ray disk drives, redundant array of independent disks (RAID) systems, solid state storage devices such as flash memory devices and/or SSDs, and DVD drives.
The machine executable instructions 532, which may be implemented by the machine readable instructions of
The cores 602 may communicate by a first example bus 604. In some examples, the first bus 604 may implement a communication bus to effectuate communication associated with one(s) of the cores 602. For example, the first bus 604 may implement at least one of an Inter-Integrated Circuit (I2C) bus, a Serial Peripheral Interface (SPI) bus, a PCI bus, or a PCIe bus. Additionally or alternatively, the first bus 604 may implement any other type of computing or electrical bus. The cores 602 may obtain data, instructions, and/or signals from one or more external devices by example interface circuitry 606. The cores 602 may output data, instructions, and/or signals to the one or more external devices by the interface circuitry 606. Although the cores 602 of this example include example local memory 620 (e.g., Level 1 (L1) cache that may be split into an L1 data cache and an L1 instruction cache), the microprocessor 600 also includes example shared memory 610 that may be shared by the cores (e.g., Level 2 (L2) cache) for high-speed access to data and/or instructions. Data and/or instructions may be transferred (e.g., shared) by writing to and/or reading from the shared memory 610. The local memory 620 of each of the cores 602 and the shared memory 610 may be part of a hierarchy of storage devices including multiple levels of cache memory and the main memory (e.g., the main memory 514, 516 of
Each core 602 may be referred to as a CPU, DSP, GPU, etc., or any other type of hardware circuitry. Each core 602 includes control unit circuitry 614, arithmetic and logic (AL) circuitry (sometimes referred to as an ALU) 616, a plurality of registers 618, the L1 cache 620, and a second example bus 622. Other structures may be present. For example, each core 602 may include vector unit circuitry, single instruction multiple data (SIMD) unit circuitry, load/store unit (LSU) circuitry, branch/jump unit circuitry, floating-point unit (FPU) circuitry, etc. The control unit circuitry 614 includes semiconductor-based circuits structured to control (e.g., coordinate) data movement within the corresponding core 602. The AL circuitry 616 includes semiconductor-based circuits structured to perform one or more mathematic and/or logic operations on the data within the corresponding core 602. The AL circuitry 616 of some examples performs integer based operations. In other examples, the AL circuitry 616 also performs floating point operations. In yet other examples, the AL circuitry 616 may include first AL circuitry that performs integer based operations and second AL circuitry that performs floating point operations. In some examples, the AL circuitry 616 may be referred to as an Arithmetic Logic Unit (ALU). The registers 618 are semiconductor-based structures to store data and/or instructions such as results of one or more of the operations performed by the AL circuitry 616 of the corresponding core 602. For example, the registers 618 may include vector register(s), SIMD register(s), general purpose register(s), flag register(s), segment register(s), machine specific register(s), instruction pointer register(s), control register(s), debug register(s), memory management register(s), machine check register(s), etc. The registers 618 may be arranged in a bank as shown in
Each core 602 and/or, more generally, the microprocessor 600 may include additional and/or alternate structures to those shown and described above. For example, one or more clock circuits, one or more power supplies, one or more power gates, one or more cache home agents (CHAs), one or more converged/common mesh stops (CMSs), one or more shifters (e.g., barrel shifter(s)) and/or other circuitry may be present. The microprocessor 600 is a semiconductor device fabricated to include many transistors interconnected to implement the structures described above in one or more integrated circuits (ICs) contained in one or more packages. The processor circuitry may include and/or cooperate with one or more accelerators. In some examples, accelerators are implemented by logic circuitry to perform certain tasks more quickly and/or efficiently than can be done by a general purpose processor. Examples of accelerators include ASICs and FPGAs such as those discussed herein. A GPU or other programmable device can also be an accelerator. Accelerators may be on-board the processor circuitry, in the same chip package as the processor circuitry and/or in one or more separate packages from the processor circuitry.
More specifically, in contrast to the microprocessor 600 of
In the example of
The interconnections 710 of the illustrated example are conductive pathways, traces, vias, or the like that may include electrically controllable switches (e.g., transistors) whose state can be changed by programming (e.g., using an HDL instruction language) to activate or deactivate one or more connections between one or more of the logic gate circuitry 708 to program desired logic circuits.
The storage circuitry 712 of the illustrated example is structured to store result(s) of the one or more of the operations performed by corresponding logic gates. The storage circuitry 712 may be implemented by registers or the like. In the illustrated example, the storage circuitry 712 is distributed amongst the logic gate circuitry 708 to facilitate access and increase execution speed.
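As a non-limiting illustration, the relationship between the logic gate circuitry 708, the configurable interconnections 710, and the storage circuitry 712 can be modeled in software. The sketch below is a toy model only and is not the disclosed hardware: the names `GATES`, `evaluate`, and `HALF_ADDER`, and the netlist tuple layout, are assumptions introduced for illustration. The netlist stands in for the programmed interconnections, and the returned dictionary stands in for storage of gate results.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for logic gates; each takes two 1-bit inputs.
GATES: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# Each entry "programs" one connection: (gate type, input a, input b, output).
Netlist = List[Tuple[str, str, str, str]]


def evaluate(netlist: Netlist, inputs: Dict[str, int]) -> Dict[str, int]:
    """Evaluate gates in netlist order, storing every result by signal name."""
    signals = dict(inputs)  # plays the role of result storage
    for gate, a, b, out in netlist:
        signals[out] = GATES[gate](signals[a], signals[b])
    return signals


# Changing this list changes the logic circuit without changing the gates.
HALF_ADDER: Netlist = [
    ("XOR", "a", "b", "sum"),
    ("AND", "a", "b", "carry"),
]

result = evaluate(HALF_ADDER, {"a": 1, "b": 1})
print(result["sum"], result["carry"])
```

Swapping `HALF_ADDER` for a different netlist yields a different circuit from the same gate set, which is the property the configurable interconnections provide.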
The example FPGA circuitry 70 of
Although
In some examples, the processor circuitry 512 of
A block diagram illustrating an example software distribution platform 805 to distribute software such as the example machine readable instructions 532 of
From the foregoing, it will be appreciated that example systems, methods, apparatus, and articles of manufacture have been disclosed that improve the responsivity of microservice instantiation. In particular, examples disclosed herein enable microservices to be instantiated from a dormant and/or otherwise hibernated state in a manner that is faster than typical monolithic architectures are capable of implementing. Examples disclosed herein manage microservice state information and instantiation activity by using high speed memory range bit reads/writes to signal and control microservice behaviors.
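As a non-limiting illustration of signaling by memory range bit reads and writes, the following minimal sketch snapshots a monitored address range at a first time, reads it again at a second time, and treats any bit that changed from 0 to 1 as an instantiation request. The function names, the `SERVICE_BY_BIT` mapping, the service names, and the use of a `bytearray` in place of a hardware memory range are all assumptions made for illustration and are not taken from the disclosure.

```python
from typing import Dict, List


def snapshot(memory: bytearray, start: int, length: int) -> bytes:
    """Copy the monitored address range so it can be compared later."""
    return bytes(memory[start:start + length])


def changed_bits(before: bytes, after: bytes) -> List[int]:
    """Return the bit offsets that went from 0 to 1 between two snapshots."""
    offsets: List[int] = []
    for byte_index, (old, new) in enumerate(zip(before, after)):
        rising = ~old & new & 0xFF  # bits clear before and set now
        for bit in range(8):
            if rising & (1 << bit):
                offsets.append(byte_index * 8 + bit)
    return offsets


# Hypothetical mapping from bit offset within the range to a service name.
SERVICE_BY_BIT: Dict[int, str] = {0: "auth", 1: "billing", 9: "transcode"}

memory = bytearray(16)            # stand-in for the shared memory
first = snapshot(memory, 4, 2)    # state information at the first time

memory[4] |= 0b10                 # a requester sets bit offset 1
memory[5] |= 0b10                 # a requester sets bit offset 9

second = snapshot(memory, 4, 2)   # query at the second time
requested = [SERVICE_BY_BIT[b] for b in changed_bits(first, second)
             if b in SERVICE_BY_BIT]
print(requested)
```

Comparing two snapshots rather than reading a single value lets the monitor distinguish a new request from a bit that was already set at the first time.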
Example methods, apparatus, systems, and articles of manufacture for network service management are disclosed herein. Further examples and combinations thereof include the following:
Example 1 includes an apparatus to invoke a service, the apparatus comprising interface circuitry to detect a request to execute the service, and processor circuitry including one or more of at least one of a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP), the at least one of the CPU, the GPU, or the DSP having control circuitry to control data movement within the processor circuitry, arithmetic and logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a result of the one or more first operations, the instructions in the apparatus, a Field Programmable Gate Array (FPGA), the FPGA including logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the logic gate circuitry and interconnections to perform one or more second operations, the storage circuitry to store a result of the one or more second operations, or Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations, the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate microservice translation circuitry to query, at a first time, a memory address range corresponding to a plurality of services, and generate state information corresponding to the plurality of services at the first time, microservice request circuitry to query, at a second time, the memory address range to identify a memory address state change, the memory address state change indicative of an instantiation request for at least one of the plurality of services, and microservice instantiation circuitry to cause a first compute device to instantiate the at least one of the plurality of services.
Example 2 includes the apparatus as defined in example 1, wherein the processor circuitry is to instantiate microservice hibernation circuitry to determine whether the at least one of the plurality of services was previously instantiated.
Example 3 includes the apparatus as defined in example 2, wherein the microservice instantiation circuitry is to instantiate the at least one of the previously instantiated services from a cache memory.
Example 4 includes the apparatus as defined in example 2, wherein the processor circuitry is to instantiate resource discovery circuitry to identify a plurality of compute devices available to execute the at least one of the plurality of services.
Example 5 includes the apparatus as defined in example 4, wherein the processor circuitry is to instantiate service level agreement (SLA) circuitry to determine whether the first compute device will satisfy SLA parameters.
Example 6 includes the apparatus as defined in example 5, wherein the microservice translation circuitry is to translate the at least one of the plurality of services to execute on a second one of the plurality of compute devices when the SLA parameters are not satisfied in connection with the first compute device.
Example 7 includes the apparatus as defined in example 1, wherein the first compute device includes at least one of the CPU, the GPU, an accelerator, the FPGA, a smart network interface card (NIC), or an infrastructure processing unit (IPU).
Example 7 includes the apparatus as defined in example 1, wherein the plurality of services includes microservices.
Example 8 includes at least one non-transitory computer readable medium comprising instructions that, when executed, cause processor circuitry to at least identify, at a first time, a memory address range corresponding to a plurality of services, build state information corresponding to the plurality of services at the first time, identify, at a second time, the memory address range to identify a memory address state change, the memory address state change indicative of an instantiation request for one of the plurality of services, and cause a first compute device to instantiate the one of the plurality of services.
Example 9 includes the at least one non-transitory computer readable medium as defined in example 8, wherein the instructions, when executed, cause the processor circuitry to determine whether the at least one of the plurality of services was previously instantiated.
Example 10 includes the at least one non-transitory computer readable medium as defined in example 9, wherein the instructions, when executed, cause the processor circuitry to instantiate the at least one of the previously instantiated services from a cache memory.
Example 11 includes the at least one non-transitory computer readable medium as defined in example 9, wherein the instructions, when executed, cause the processor circuitry to identify a plurality of compute devices available to execute the one of the plurality of services.
Example 12 includes the at least one non-transitory computer readable medium as defined in example 11, wherein the instructions, when executed, cause the processor circuitry to determine whether the first compute device will satisfy service level agreement (SLA) parameters.
Example 13 includes the at least one non-transitory computer readable medium as defined in example 12, wherein the instructions, when executed, cause the processor circuitry to translate the one of the plurality of services to execute on a second one of the plurality of compute devices when the SLA parameters are not satisfied in connection with the first compute device.
Example 14 includes the at least one non-transitory computer readable medium as defined in example 8, wherein the plurality of services includes microservices.
Example 15 includes an apparatus to invoke a service, the apparatus comprising means for translating services to query, at a first time, a memory address range corresponding to a plurality of services, and generate state information corresponding to the plurality of services at the first time, means for requesting microservices to query, at a second time, the memory address range to identify a memory address state change, the memory address state change indicative of an instantiation request for at least one of the plurality of services, and means for instantiating microservices to cause a first compute device to instantiate the at least one of the plurality of services.
Example 16 includes the apparatus as defined in example 15, further including means for hibernating services to determine whether the at least one of the plurality of services was previously instantiated.
Example 17 includes the apparatus as defined in example 16, wherein the means for instantiating microservices is to instantiate the at least one of the previously instantiated services from a cache memory.
Example 18 includes the apparatus as defined in example 16, further including means for discovering resources to identify a plurality of compute devices available to execute the at least one of the plurality of services.
Example 19 includes the apparatus as defined in example 18, further including means for analyzing service level agreements (SLAs) to determine whether the first compute device will satisfy SLA parameters.
Example 20 includes the apparatus as defined in example 19, wherein the means for translating services is to translate the at least one of the plurality of services to execute on a second one of the plurality of compute devices when the SLA parameters are not satisfied in connection with the first compute device.
Example 21 includes the apparatus as defined in example 15, wherein the first compute device includes at least one of a central processing unit (CPU), a graphics processing unit (GPU), an accelerator, a field programmable gate array (FPGA), a smart network interface card (NIC), or an infrastructure processing unit (IPU).
Example 22 includes the apparatus as defined in example 15, wherein the plurality of services includes microservices.
Example 23 includes a method comprising extracting, by executing an instruction with processor circuitry at a first time, a memory address range corresponding to a plurality of services, composing, by executing an instruction with the processor circuitry at the first time, state information corresponding to the plurality of services, extracting, by executing an instruction with the processor circuitry at a second time, the memory address range to identify a memory address state change, the memory address state change indicative of an instantiation request for one of the plurality of services, and instantiating, by executing an instruction with the processor circuitry, a first compute device to execute the one of the plurality of services.
Example 24 includes the method as defined in example 23, further including determining whether the one of the plurality of services was previously instantiated.
Example 25 includes the method as defined in example 24, further including instantiating, in response to the one of the plurality of services being previously instantiated, at least one of the previously instantiated services from a cache memory.
Example 26 includes the method as defined in example 24, further including detecting a plurality of compute devices available to execute the one of the plurality of services.
Example 27 includes the method as defined in example 26, further including determining whether the first compute device will satisfy service level agreement (SLA) parameters.
Example 28 includes the method as defined in example 27, further including translating the one of the plurality of services to execute on a second one of the plurality of compute devices when the SLA parameters are not satisfied in connection with the first compute device.
Example 29 includes the method as defined in example 23, wherein the first compute device includes at least one of a central processing unit (CPU), a graphics processing unit (GPU), an accelerator, a field programmable gate array (FPGA), a smart network interface card (NIC), or an infrastructure processing unit (IPU).
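As a non-limiting illustration, the decisions recited in Examples 24 through 28 (checking for a previous instantiation, identifying available compute devices, testing SLA parameters, and falling back to a second compute device) can be sketched as follows. The `Device` and `Manager` classes, the `latency_ms` metric, the device names, and the returned string format are assumptions introduced for illustration only. The disclosure does not specify which SLA parameters are evaluated or how hibernated services are cached.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:
    name: str
    latency_ms: float  # hypothetical SLA metric for this sketch


@dataclass
class Manager:
    devices: List[Device]             # compute devices found by discovery
    max_latency_ms: float             # hypothetical SLA parameter
    hibernated: Dict[str, str] = field(default_factory=dict)

    def pick_device(self) -> Optional[Device]:
        """Return the first device, in list order, that satisfies the SLA."""
        for device in self.devices:
            if device.latency_ms <= self.max_latency_ms:
                return device
        return None

    def instantiate(self, service: str) -> str:
        """Place a service and report whether it came from cache or cold."""
        device = self.pick_device()
        if device is None:
            raise RuntimeError(f"no device satisfies the SLA for {service}")
        # A previously instantiated service is resumed from cache.
        source = "cache" if service in self.hibernated else "cold"
        self.hibernated[service] = device.name
        return f"{service}@{device.name}:{source}"


manager = Manager(
    devices=[Device("cpu0", 12.0), Device("ipu0", 3.0)],
    max_latency_ms=5.0,
)
first_start = manager.instantiate("auth")   # cpu0 fails the SLA, ipu0 is used
second_start = manager.instantiate("auth")  # now resumed from cache
print(first_start, second_start)
```

In this sketch the first listed device fails the SLA check, so the service is placed on the second device, and the repeated request is served from the cached state.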
The following claims are hereby incorporated into this Detailed Description by this reference. Although certain example systems, methods, apparatus, and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all systems, methods, apparatus, and articles of manufacture fairly falling within the scope of the claims of this patent.