The invention relates to data processing. More precisely, one or more embodiments of the invention pertain to a method and system for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment.
Being able to use a plurality of processing devices for executing tasks is of great advantage for various reasons.
However, in many cases the use of a plurality of processing devices can be challenging.
For instance, the processing devices may be of various types rendering the execution complicated.
Another issue is the fact that the environment may be dynamic.
There is a need for at least one of a method and a system that will overcome, inter alia, at least one of the above-identified drawbacks.
Features of the invention will be apparent from review of the disclosure, drawings and description of the invention below.
According to a broad aspect there is disclosed a system for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment, the system comprising a plurality of heterogeneous host machines, each heterogeneous host machine being characterized by corresponding processing resources, each heterogeneous host machine comprising: a telecommunication application for enabling the heterogeneous host machine to be part of a telecommunication network with at least one other heterogeneous host machine; a virtualization engine for executing a received virtualized element using the corresponding processing resources of the heterogeneous host machine; a geolocation module for providing at least an indication of a present position of the corresponding heterogeneous host machine; a distributed system orchestrator for managing an execution of a plurality of tasks using at least one of the plurality of heterogeneous host machines, wherein the plurality of tasks is comprised of a corresponding plurality of virtualized elements, the distributed system orchestrator comprising: a telecommunication application for enabling the distributed system orchestrator to be part of the telecommunication network comprising at least one heterogeneous host machine of the plurality of heterogeneous host machines and a task assignment module for assigning each virtualized element of the plurality of virtualized elements to a selected heterogeneous host machine located on the telecommunication network, wherein the assigning of the virtualized element is performed according to a given multi-period workload placement problem; wherein the given multi-period workload placement problem is determined by the distributed system orchestrator using at least the indication of a present position of each available heterogeneous host machine and an indication of corresponding resource availability in at least one heterogeneous host machine of the plurality of heterogeneous host machines and in accordance with at least one given criterion.
According to one or more embodiments, the multi-period workload placement problem is determined by the distributed system orchestrator using information related to heterogeneous host machines joining or leaving the telecommunication network.
According to one or more embodiments, the telecommunication network comprises a virtual ad hoc mobile telecommunication network.
According to one or more embodiments, the multi-period workload placement problem is amended in response to a given event.
According to one or more embodiments, the given event comprises a change in resources available.
According to one or more embodiments, the amendment of the multi-period workload placement problem comprises transferring a virtualized element from a first given heterogeneous host machine directly to a second given heterogeneous host machine.
According to one or more embodiments, the heterogeneous host machines are wireless host machines, further wherein the at least one given criterion is selected from a group consisting of a minimization of host machine utilization costs; a minimization of a number of migrations; a minimization of energy consumption; a minimization of refused workloads; a minimization of host machine physical movements; a throughput of at least one given host machine; a spectrum sharing behavior between at least two pairs of host machines; and an interference between at least two pairs of host machines.
According to one or more embodiments, the telecommunication application of the distributed system orchestrator reserves dedicated suitable routing paths according to the multi-period workload placement problem.
According to one or more embodiments, the given multi-period workload placement problem is further determined using at least one telecommunication network property.
According to one or more embodiments, the at least one telecommunication network property comprises at least one of a latency for transferring a first given virtualized element to a given heterogeneous host machine; a latency for migrating a second given virtualized element from a first given heterogeneous host machine to a second given heterogeneous host machine; and a network topology.
According to one or more embodiments, the geolocation module further provides an indication of a possible future position of the corresponding heterogeneous host machine; further wherein the given multi-period workload placement problem is further determined using the indication of a possible future position of the corresponding heterogeneous host machine.
According to one or more embodiments, each heterogeneous host machine is assigned an indication of a corresponding reputation; further wherein the given multi-period workload placement problem is further determined using the indication of a corresponding reputation.
According to one or more embodiments, each heterogeneous host machine comprises an energy module for providing an indication of a corresponding level of energy available; further wherein the given multi-period workload placement problem is further determined using the indication of a corresponding level of energy available.
According to a broad aspect, there is disclosed a method for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment, the method comprising providing a plurality of heterogeneous host machines, each given heterogeneous host machine having corresponding processing resources, each given heterogeneous host machine comprising a telecommunication application for enabling the given heterogeneous host machine to be part of a telecommunication network with at least one other heterogeneous host machine, a virtualization engine for executing a received virtualized element using the corresponding processing resources, and a geolocation module for providing at least an indication of a present position of the given heterogeneous host machine; providing a distributed system orchestrator for managing an execution of a plurality of tasks using at least one of the plurality of heterogeneous host machines with a corresponding telecommunication application for enabling the distributed system orchestrator to be part of the telecommunication network comprising at least one available heterogeneous host machine of the plurality of heterogeneous host machines and with a task assignment module for assigning each virtualized element of the plurality of virtualized elements to a selected heterogeneous host machine located on the telecommunication network; receiving, using the distributed system orchestrator, a plurality of tasks to execute, each task comprising a corresponding plurality of virtualized elements; obtaining, using the distributed system orchestrator, an indication of a present location of each available heterogeneous host machine; obtaining, using the distributed system orchestrator, an indication of a resource availability for each available heterogeneous host machine; determining, using the distributed system orchestrator, a multi-period workload placement problem using the received indication of a present location of each available heterogeneous host machine and the indication of a resource availability of each available heterogeneous host machine; and for each task of the plurality of tasks assigning each corresponding virtualized element of the plurality of corresponding virtualized elements to a corresponding host machine using the determined multi-period workload placement problem.
According to one or more embodiments, the method further comprises executing each of the assigned virtualized elements using the corresponding heterogeneous host machine.
According to one or more embodiments, the method further comprises amending the multi-period workload placement problem in response to a given event.
According to one or more embodiments, the method further comprises assigning, for each of the plurality of heterogeneous host machines, an indication of a corresponding reputation; further wherein the determining of the multi-period workload placement problem is further performed using the plurality of indications of a corresponding reputation.
According to one or more embodiments, the method further comprises obtaining an indication of a corresponding level of energy available in each of the plurality of heterogeneous host machines; further wherein the determining of the multi-period workload placement problem is further performed using the obtained indications of a corresponding level of energy available.
It will be appreciated that the system and the method disclosed above are of great advantage for various reasons.
A first reason is that they enable the use of a plurality of heterogeneous host machines to execute a plurality of tasks in a dynamic environment.
Another reason is that they enable the use of heterogeneous host machines.
In order that the invention may be readily understood, embodiments of the invention are illustrated by way of example in the accompanying drawings.
Further details of the invention and its advantages will be apparent from the detailed description included below.
In the following description of the embodiments, references to the accompanying drawings are by way of illustration of an example by which the invention may be practiced.
The term “invention” and the like mean “the one or more inventions disclosed in this application,” unless expressly specified otherwise.
The terms “an aspect,” “an embodiment,” “embodiment,” “embodiments,” “the embodiment,” “the embodiments,” “one or more embodiments,” “some embodiments,” “certain embodiments,” “one embodiment,” “another embodiment” and the like mean “one or more (but not all) embodiments of the disclosed invention(s),” unless expressly specified otherwise.
A reference to “another embodiment” or “another aspect” in describing an embodiment does not imply that the referenced embodiment is mutually exclusive with another embodiment (e.g., an embodiment described before the referenced embodiment), unless expressly specified otherwise.
The terms “including,” “comprising” and variations thereof mean “including but not limited to,” unless expressly specified otherwise.
The terms “a,” “an” and “the” mean “one or more,” unless expressly specified otherwise.
The term “plurality” means “two or more,” unless expressly specified otherwise.
The term “herein” means “in the present application, including anything which may be incorporated by reference,” unless expressly specified otherwise.
The term “whereby” is used herein only to precede a clause or other set of words that express only the intended result, objective or consequence of something that is previously and explicitly recited. Thus, when the term “whereby” is used in a claim, the clause or other words that the term “whereby” modifies do not establish specific further limitations of the claim or otherwise restrict the meaning or scope of the claim.
The term “e.g.” and like terms mean “for example,” and thus do not limit the terms or phrases they explain.
The term “i.e.” and like terms mean “that is,” and thus limit the terms or phrases they explain.
Neither the Title nor the Abstract is to be taken as limiting in any way as the scope of the disclosed invention(s). The title of the present application and headings of sections provided in the present application are for convenience only, and are not to be taken as limiting the disclosure in any way.
Numerous embodiments are described in the present application, and are presented for illustrative purposes only. The described embodiments are not, and are not intended to be, limiting in any sense. The presently disclosed invention(s) are widely applicable to numerous embodiments, as is readily apparent from the disclosure. One of ordinary skill in the art will recognize that the disclosed invention(s) may be practiced with various modifications and alterations, such as structural and logical modifications. Although particular features of the disclosed invention(s) may be described with reference to one or more particular embodiments and/or drawings, it should be understood that such features are not limited to usage in the one or more particular embodiments or drawings with reference to which they are described, unless expressly specified otherwise.
With all this in mind, the present invention is directed to a method and a system for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment.
It will be appreciated that the task may be of various types. In fact, it will be appreciated that a task corresponds to a set of instructions that, during their execution, will consume a given amount of resources (e.g. computing resources, memory resources, storage resources, etc.) or physical capacities (sensors, mobility, etc.).
For instance and in a non-limiting example, in a Web server, a task may be comprised of a set of instructions to receive and manage the requests of a web browser aiming to access a web page.
In the case where an aerial picture has to be taken, a task may comprise a set of instructions to allow an Unmanned Aerial Vehicle (UAV) controlled by a Robot Operating System (ROS) to take and store a picture from a specific point with the desired angle, zoom level, resolution, etc.
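By way of a non-limiting illustration, a task and the resources and physical capacities it consumes may be sketched as a simple data structure; the Python names and field values below are merely illustrative and are not prescribed by the present disclosure.

```python
from dataclasses import dataclass

@dataclass
class Task:
    """A set of instructions together with the resources and physical
    capacities it consumes during execution (illustrative model)."""
    name: str
    cpu_cores: float = 0.0   # computing resources
    ram_mb: int = 0          # memory resources
    storage_mb: int = 0      # storage resources
    capacities: tuple = ()   # physical capacities, e.g. sensors, mobility

# Example: the aerial-picture task described above, expressed as data.
take_picture = Task(name="uav_take_picture",
                    cpu_cores=0.5, ram_mb=256, storage_mb=50,
                    capacities=("rgb_camera", "aerial_mobility"))
```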
Now referring to
The system 10 comprises a plurality of heterogeneous host machines and a distributed orchestrator 12. More precisely and in this specific environment, the plurality of heterogeneous host machines comprises a first heterogeneous host machine 14, a second heterogeneous host machine 16 and a third heterogeneous host machine 18. It will be appreciated by the skilled addressee that any number of heterogeneous host machines may be used.
It will be further appreciated that the plurality of heterogeneous host machines are interconnected with the distributed orchestrator 12 via a data network 20. While a single data network is shown in
It will be appreciated that each host machine is a machine running its own Operating System (OS), e.g., Linux Ubuntu 16.04. It will be appreciated that each host machine is equipped with at least one corresponding processing resource and is characterized by corresponding physical capacities.
The at least one corresponding processing resource may be of various types.
For instance and in one embodiment, the processing resource is a central processing power which can be characterized by a number and a type of Central Processing Unit (CPU).
In another embodiment, the processing resource is a graphics processing power which can be characterized by a number and a type of Graphics Processing Unit (GPU).
In another embodiment, the processing resource is a memory space which is a Random Access Memory (RAM) and which can be characterized by a given size defined in Mbytes (MBs).
In another embodiment, the processing resource is a low-speed storage space of the type offered by low-speed Hard Disk Drives (HDDs), which can be characterized by a size defined in Mbytes (MBs).
In another embodiment, the processing resource is a high-speed storage space of the type offered by high-speed Solid-State Disks (SSDs), which can be characterized by a size defined in Mbytes (MBs).
In another embodiment, the processing resource is a networking resource which can be characterized by a number of network interfaces, a bandwidth offered per network interface, and a type of network interfaces.
Moreover, it will be appreciated that the physical capabilities may comprise various sensors, such as for instance RGB camera sensors, infrared camera sensors and temperature sensors.
For instance and in accordance with an embodiment, the physical capability comprises an aerial mobility characterized by a maximum speed, a maximum altitude, etc.
For instance and in accordance with an embodiment, the physical capability comprises a ground mobility characterized by a maximum speed, a steering angle, etc.
For instance and in accordance with an embodiment, the physical capability comprises a physical transportation system characterized by a maximum payload weight, etc.
For instance and in accordance with an embodiment, the physical capability comprises an Internet connectivity.
The skilled addressee will appreciate that the physical capability may be comprised of various other elements known to the skilled addressee.
It will be appreciated that the heterogeneous host machines may therefore comprise a set of host machines having different characteristics in terms of processing resources and physical capacities.
For instance and in accordance with one embodiment, a first heterogeneous host machine may be comprised of an Onion Omega 2+ running Linux OpenWrt and comprised of 1 CPU core running at 580 MHz, 128 MB of RAM, 32 MB of high-speed storage space, and 1 mt7628 Wi-Fi interface split into two virtual Wi-Fi interfaces (one access point and one station).
Still in this embodiment, a second heterogeneous host machine may be comprised of a desktop server running Windows 10 and comprising an Intel® Core™ i7-7700T CPU with four 2.9 GHz cores, one Intel® HD Graphics 630, 8 GB of RAM, 1 TB of low-speed storage space, 1 Ethernet 100 Mbps interface, 1 RTL8814au Wi-Fi interface in station mode.
Still in this embodiment, a third heterogeneous host machine may be comprised of a UAV controlled by an NVIDIA TX2 running Ubuntu 16.04 for Tegra architectures and comprised of 6 CPU cores from an HMP Dual Denver 2/2 MB L2 + Quad ARM® A57/2 MB L2, one Nvidia Pascal GPU with 256 cores, 8 GB of RAM, 32 GB of high-speed storage space, 1 Gbps Ethernet interface, and one 802.11ac Wi-Fi interface in station mode.
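By way of a non-limiting illustration, the heterogeneity of the three example host machines may be captured by a common descriptor; the field names below are assumptions made for this sketch, and the CPU clock rate of the third machine is left unspecified since the example above does not state it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class HostMachine:
    """Illustrative descriptor of a host machine's processing resources;
    the field names are assumptions made for this sketch."""
    name: str
    os: str
    cpu_cores: int
    ram_mb: int
    storage_mb: int
    interfaces: Tuple[str, ...]
    cpu_mhz: Optional[int] = None  # clock rate, when specified

# The three example machines above, encoded as data
# (sizes in MB; 1 TB taken as 1_048_576 MB).
omega = HostMachine("onion_omega_2plus", "Linux OpenWrt", 1, 128, 32,
                    ("mt7628 Wi-Fi (AP + station)",), cpu_mhz=580)
server = HostMachine("desktop_server", "Windows 10", 4, 8192, 1_048_576,
                     ("Ethernet 100 Mbps", "RTL8814au Wi-Fi"), cpu_mhz=2900)
uav = HostMachine("uav_tx2", "Ubuntu 16.04", 6, 8192, 32768,
                  ("Ethernet 1 Gbps", "802.11ac Wi-Fi"))
```

Such a uniform descriptor is what allows an orchestrator to compare machines with very different characteristics when taking placement decisions.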
The skilled addressee will appreciate that various alternative embodiments may be provided for the heterogeneous host machines.
It will be appreciated that each host machine is running a telecommunication application for enabling the host machine to be part of a telecommunication network with at least one other heterogeneous host machine. In one embodiment, the telecommunication network comprises a virtual ad hoc mobile telecommunication network.
In one embodiment, the telecommunication application comprises a software module running on each physical host machine to enable inter-host communication even through multi-hop routing paths.
For instance and in the embodiment of a set of four host machines, such as for instance three Raspberry Pi 3 Model B+ and one Onion Omega 2+, the four devices are connected over Wi-Fi through a hot-spot created by the mt7628 Wi-Fi embedded interface of the Onion Omega 2+ (the three RPI Wi-Fi interfaces are connected in station mode to the hot spot). The Onion Omega 2+ manages a WLAN with IP address 192.168.3.0/24, by keeping for itself the IP address 192.168.3.1 and assigning three other distinct IP addresses of the same network to the three RPIs. In this case, the telecommunication module on the Onion Omega 2+ is made by the TCP/IP stack and all related networking services of the OS combined with the Wi-Fi drivers managing the Wi-Fi interface in hot-spot mode, as well as the physical interface itself. On the three Raspberry Pis, the only difference consists in the Wi-Fi drivers used to control the network interface in station mode.
In another embodiment, the four devices are connected over multiple network interfaces. It will be appreciated that the embedded interfaces may be accompanied by other USB network interfaces. A network middleware running in the user space runs on each device to connect all of them on the same multi-hop network by exploiting all the network interfaces available. The telecommunication application of each host machine is then integrated with the network middleware and the other drivers necessary to run the additional external network interfaces.
In another embodiment, the four devices are equipped with a 5G network interface that enables all of them to keep constant connectivity with a server placed in the cloud acting as a bridge between the four devices. In such case, the telecommunication application on each node is made by the TCP/IP stack and all related networking services of the OS combined with the drivers of the 5G interface, as well as the physical interface itself. The telecommunication application also includes the software running in the cloud on the bridge server.
It will be appreciated that each host machine further comprises a virtualization engine. The virtualization engine is used for executing a received virtualized element using the corresponding processing resources of the given host machine.
It will be appreciated that a virtualization engine is a software module that is running on the top of host machines with OS and physical hardware supporting virtualization and which makes it possible to instantiate, run, manage and stop multiple virtualized elements on the same host machine. It will be appreciated by the skilled addressee that the virtualization engine takes care of distributing the processing resources and capacities among all the virtualized elements currently running on the same host machine. It will be appreciated that various virtualization engines may be used, such as for instance Docker Engine, Kubernetes Engine, Hyper-V, VMWare vSphere, KVM, etc.
It will be appreciated that a virtualized element may be defined as a dedicated software environment instantiated on a host machine, capable, through the process of virtualization, of emulating functions, software modules and hardware not supported by the underlying host machine. For instance, it will be appreciated that a virtualized element makes it possible to run a Linux-based application on top of a Windows host machine. It will be further appreciated that a virtualized element runs in an isolated manner with respect to other virtualized elements placed on the same host machine. Most popular examples of virtualized elements include Virtual Containers (VCs) and Virtual Machines (VMs).
It will be further appreciated that each host machine further comprises a geolocation module. The geolocation module is used for providing at least an indication of a present position of the corresponding host machine.
The geolocation module may comprise at least one of a software module and a physical interface and is used for at least estimating a current position of a host machine. The skilled addressee will appreciate that the geolocation module may be of various types.
In one embodiment, the geolocation module comprises a GPS based system comprising a GPS interface which can estimate its position by trilateration with respect to geostationary satellites, as known to the skilled addressee.
In another embodiment, the geolocation module is implemented using an Ultra-Wide Band (UWB) system. In fact, it will be appreciated that in such an embodiment three host machines equipped with a UWB interface, such as for instance the DWM1001 from DecaWave, may compute a relative position of a fourth host machine also equipped with a UWB interface by trilateration, as known to the skilled addressee. It will be appreciated that the distance between each pair of UWB-powered host machines may be computed by estimating a time of flight of each transmitted communication probe. If one host machine is chosen as the origin of a reference system of coordinates, all the relative positioning measures done by each subset of four host machines can be converted according to it. It will be appreciated that such a geolocation module is collaborative and therefore requires all the host machines to be on the same telecommunication network to operate.
In another embodiment, the geolocation module may be implemented using a Wi-Fi range-based system similar to the UWB system. In such embodiment, host machines are equipped with a Wi-Fi interface capable of returning the Received Signal Strength Indicator (RSSI) from other host machines in range. The relative positions are computed by converting the RSSI into estimated distance values, e.g., by fitting a path loss function. Trilateration processes are thus based on these distance values.
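By way of a non-limiting illustration, the conversion of an RSSI reading into a distance estimate and the subsequent trilateration may be sketched as follows; the log-distance path loss parameters used here are illustrative and would, in practice, be obtained by fitting, as noted above.

```python
import math

def rssi_to_distance(rssi, rssi_ref=-40.0, d_ref=1.0, path_loss_exp=2.0):
    """Convert an RSSI reading (dBm) to an estimated distance (m) by
    inverting a log-distance path loss model; reference RSSI, reference
    distance and exponent are illustrative calibration values."""
    return d_ref * 10 ** ((rssi_ref - rssi) / (10 * path_loss_exp))

def trilaterate(anchors, distances):
    """2-D trilateration from three anchor positions and three range
    estimates, obtained by subtracting pairs of circle equations and
    solving the resulting 2x2 linear system (Cramer's rule)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```

The same trilateration step applies to the UWB embodiment, with the distances coming from time-of-flight estimates instead of RSSI conversion.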
The skilled addressee will appreciate that the geolocation module may be provided according to various alternative embodiments.
Still referring to
It will be appreciated that the distributed system orchestrator 12 comprises a telecommunication application for enabling the distributed system orchestrator 12 to be part of the telecommunication network comprising at least one heterogeneous host machine of the plurality of heterogeneous host machines to thereby be operationally connected with the at least one heterogeneous host machine.
The distributed system orchestrator 12 further comprises a task assignment module. The task assignment module is used for assigning each virtualized element of the plurality of virtualized elements to a selected host machine located on the telecommunication network. It will be further appreciated that the assigning of the virtualized element is performed according to a given multi-period workload placement problem.
The given multi-period workload placement problem is determined by the distributed system orchestrator 12 using at least the indication of a present position of each available host machine and an indication of corresponding resource availability in each of at least one host machine of the plurality of host machines and in accordance with at least one given criterion. In one embodiment, the multi-period workload placement problem is determined by the distributed system orchestrator 12 using information related to host machines joining or leaving the telecommunication network.
It will be further appreciated that in one embodiment, the given multi-period workload placement problem is further determined using at least one telecommunication network property. The at least one telecommunication network property may be selected from a group consisting of a latency for transferring a first given virtualized element to a given host machine, a latency for migrating a second given virtualized element from a first given host machine to a second given host machine, and a network topology.
In fact, it will be appreciated that the distributed system orchestrator 12 comprises a software module running on each host machine to manage, in a collaborative manner, virtualization and all related processes (e.g., reservation of routing paths) within a set of multiple host machines. Differently from traditional centralized orchestration solutions, e.g., VMWare vCenter, Docker Swarm, Openstack Heat, etc., the distributed system orchestrator 12 keeps virtualization decisions local, by empowering different subsets of host machines with the capability of exchanging local system information and later taking real-time optimal task assignment decisions. The goal of the distributed system orchestrator 12 is to find a set of task assignment decisions that optimizes at least one given criterion. The distributed nature of the distributed system orchestrator 12 is crucial to manage large sets of host machines with rapidly varying physical configurations related, for instance, to host machine mobility and temporary availability.
As mentioned above, it will be appreciated that the distributed system orchestrator 12 comprises a task assignment module.
The task assignment module consists of a multi-objective placement problem defined by a Mixed-Integer Non-Linear Programming (MINLP) formulation. It will be appreciated that in this case the workload placement problem is meant to handle workloads with a multi-period nature (i.e. some tasks may not be executable simultaneously). For this reason, it is referred to as a multi-period workload placement problem.
Consider a graph made of nodes and arcs representing a set of host machines (nodes) and their physical communication links (arcs), together with a set of workloads (applications) already placed (mapped) on the top of the set of host machines, each workload being represented by two dedicated graphs: a first graph whose nodes and arcs represent a set of virtualized elements (nodes) and the communication bandwidth requirements of the connections between them, and a second graph whose nodes and arcs represent the set of virtualized elements (nodes) and their parallelization/serialization constraints. Consider also a second set of workloads, represented by the same two graphs just described, demanding to be placed (mapped) on the top of the set of host machines.
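By way of a non-limiting illustration, the two graphs of a workload may be sketched as follows; the dictionary layout and the `ready_elements` helper are assumptions made for this sketch, the latter showing how parallelization/serialization constraints give the workload its multi-period nature.

```python
# Illustrative workload (application) represented by the two graphs
# described above: bandwidth requirements between virtualized elements,
# and parallelization/serialization (precedence) constraints.
workload = {
    "elements": ["ve1", "ve2", "ve3"],
    # arcs with the required communication bandwidth (Mbps)
    "bandwidth": {("ve1", "ve2"): 10, ("ve2", "ve3"): 5},
    # precedence arcs: "ve3" may only be placed after "ve1" has concluded
    "precedence": {("ve1", "ve3")},
}

def ready_elements(workload, concluded):
    """Virtualized elements not yet concluded whose precedence
    constraints are satisfied, i.e. placeable in the current period."""
    return [e for e in workload["elements"]
            if e not in concluded
            and all(pred in concluded
                    for (pred, succ) in workload["precedence"] if succ == e)]
```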
It will be appreciated that a multi-period workload placement problem is a mathematical representation of the orchestration process that defines the placement decisions, e.g., which workload node to virtualize on each host machine, which routing path to assign between different pairs of workload nodes, which workload nodes to put in the waiting queue, which workload nodes already placed on active host machines to migrate to different host machines, where to move a host machine, which host machine to assign to dedicated communication roles, etc.
It will be appreciated that the multi-period workload placement problem defines also which combinations of placement decisions are considered feasible with respect to the system parameters, e.g., the maximum resource of a host machine or the maximum bandwidth of a network link.
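By way of a non-limiting illustration, the feasibility of a candidate set of placement decisions with respect to such system parameters may be sketched as a predicate; the single-resource, single-hop model below is a deliberate simplification, and all names are illustrative.

```python
def is_feasible(placement, hosts, links, element_cpu, bandwidth):
    """Check a candidate placement (mapping each virtualized element to
    a host) against the feasibility constraints named above: maximum
    host resources and maximum link bandwidth."""
    # host capacity: total CPU demand placed on a host must not exceed it
    used = {h: 0 for h in hosts}
    for elem, host in placement.items():
        used[host] += element_cpu[elem]
    if any(used[h] > hosts[h] for h in hosts):
        return False
    # link bandwidth: demand between elements on different hosts must
    # fit on the (single-hop, for this sketch) link between those hosts
    need = {}
    for (e1, e2), bw in bandwidth.items():
        h1, h2 = placement[e1], placement[e2]
        if h1 != h2:
            key = tuple(sorted((h1, h2)))
            need[key] = need.get(key, 0) + bw
    return all(need[k] <= links.get(k, 0) for k in need)
```

A full formulation would also cover the remaining resources and capacities, multi-hop routing paths, and the multi-period precedence constraints.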
In one embodiment, the multi-period workload placement problem is amended in response to a given event.
It will be appreciated that the given event comprises a change in resources available in one embodiment.
It will be further appreciated that in one embodiment the amendment of the multi-period workload placement problem comprises transferring a virtualized element from a first given host machine directly to a second given host machine.
It will be appreciated that in one embodiment, the telecommunication application of the distributed system orchestrator 12 reserves dedicated suitable routing paths according to the multi-period workload placement problem.
It will be appreciated that each virtualized element has requirements related to the above set of processing resources and capacities. In the context of the placement of a virtualized element on the top of a host machine, the required amount of processing resources is assigned from the host machine to the corresponding virtualized element. The available processing resources are computed as the difference between the total amount of processing resources offered by a host machine in idle state and those currently assigned to the virtualized elements already mapped onto it.
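By way of a non-limiting illustration, the computation of available processing resources described above may be sketched as follows; the resource names are merely illustrative.

```python
def available_resources(idle_total, assigned):
    """Available resources of a host machine, computed as described:
    the total amount offered in the idle state minus what is currently
    assigned to the virtualized elements already mapped onto it."""
    return {res: idle_total[res] - sum(a.get(res, 0) for a in assigned)
            for res in idle_total}

# Hypothetical host with two virtualized elements already mapped onto it.
host_total = {"cpu_cores": 4, "ram_mb": 8192}
mapped = [{"cpu_cores": 1, "ram_mb": 1024}, {"cpu_cores": 2, "ram_mb": 2048}]
```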
It will be appreciated that the multi-period workload placement problem therefore defines a multi-objective function that the distributed orchestrator is supposed to optimize when computing a multi-period-placement (task-assignment) solution (configuration). It will be appreciated that each objective component is also referred to as a criterion. It will be appreciated that the criterion may be of various types. In one embodiment the at least one criterion is selected from a group consisting of a minimization of host machine utilization costs, a minimization of a number of migrations, a minimization of energy consumption, a minimization of refused workloads, a minimization of host machine physical movements, a throughput of at least one given host machine, a spectrum sharing behavior between at least two pairs of host machines, an interference between at least two pairs of host machines, etc.
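By way of a non-limiting illustration, the multi-objective function may be scalarized as a weighted sum of criteria; the weights, the criteria chosen, and the use of a weighted sum (rather than, e.g., a lexicographic ordering) are assumptions of this sketch.

```python
def placement_score(metrics, weights):
    """Scalarize the multi-objective function as a weighted sum of
    criteria, all treated here as quantities to be minimized."""
    return sum(weights[c] * metrics[c] for c in weights)

# Illustrative weights over three of the criteria listed above.
weights = {"utilization_cost": 1.0, "migrations": 0.5, "energy": 0.2}
candidate_a = {"utilization_cost": 10.0, "migrations": 2, "energy": 30.0}
candidate_b = {"utilization_cost": 12.0, "migrations": 0, "energy": 20.0}
best = min((candidate_a, candidate_b),
           key=lambda m: placement_score(m, weights))
```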
It will be appreciated that the given multi-period workload placement problem is further determined using at least one telecommunication network property.
It will be further appreciated that the at least one telecommunication network property comprises at least one of a latency for transferring a first given virtualized element to a given host machine; a latency for migrating a second given virtualized element from a first given host machine to a second given host machine; and a network topology.
It will be appreciated that a given event is an event that triggers the re-computation of a new placement solution by the distributed orchestrator. These events include an arrival of a new workload, a resource scarcity observed on a host machine due to unexpected virtualized element resource consumption behavior, a triggering of under-utilization thresholds, a departure of a host machine, an arrival of a new host machine, and a conclusion of a task that was blocking the placement of another task of the same workload (application).
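The event handling above can be sketched as a simple dispatch: only events of the listed kinds trigger a re-computation of the placement solution. Event names are illustrative labels, not identifiers from the disclosure:

```python
# Illustrative event filter: a new placement solution is computed only when
# one of the triggering events described in the text occurs.
REPLACEMENT_EVENTS = {
    "new_workload",
    "resource_scarcity",
    "under_utilization",
    "host_departure",
    "host_arrival",
    "blocking_task_concluded",
}

def handle_event(event_kind, recompute_placement):
    """Invoke the placement re-computation callback for triggering events."""
    if event_kind in REPLACEMENT_EVENTS:
        return recompute_placement()
    return None  # non-triggering events are ignored

result = handle_event("host_departure", lambda: "new placement computed")
print(result)  # new placement computed
```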
It will be appreciated that in one embodiment, the geolocation module further provides an indication of a possible future position of the corresponding host machine. In such case, the given multi-period workload placement problem is further determined using the indication of a possible future position of the corresponding host machine.
It will be appreciated that in one embodiment each heterogeneous host machine is assigned an indication of a corresponding reputation. In such case, the given multi-period workload placement problem is further determined using the indication of a corresponding reputation.
It will be further appreciated that each heterogeneous host machine comprises an energy module for providing an indication of a corresponding level of energy available. In such case, the given multi-period workload placement problem is further determined using the indication of a corresponding level of energy available.
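One simple way the reputation and energy indications above could feed into the placement problem is as a pre-filter on candidate host machines; the thresholds and field names below are assumptions for illustration only:

```python
# Hedged sketch: pre-filtering candidate host machines by the reputation and
# available-energy indications before the placement problem is solved.

def eligible_hosts(hosts, min_reputation, min_energy):
    """Return ids of host machines meeting both minimum indications."""
    return [
        h["id"]
        for h in hosts
        if h["reputation"] >= min_reputation and h["energy"] >= min_energy
    ]

hosts = [
    {"id": "h1", "reputation": 0.9, "energy": 0.8},
    {"id": "h2", "reputation": 0.4, "energy": 0.9},
    {"id": "h3", "reputation": 0.8, "energy": 0.2},
]
print(eligible_hosts(hosts, 0.5, 0.5))  # ['h1']
```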
It will be appreciated that there is also disclosed a method for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment.
According to processing step 100, a plurality of heterogeneous host machines is provided. Each given heterogeneous host machine has corresponding processing resources. Each given heterogeneous host machine comprises a telecommunication application for enabling the given heterogeneous host machine to be part of a telecommunication network with at least one other heterogeneous host machine. Each given heterogeneous host machine further comprises a virtualization engine for executing a received virtualized element using the corresponding processing resources. Each given heterogeneous host machine comprises a geolocation module for providing at least an indication of a present position of the given heterogeneous host machine.
According to processing step 102, a distributed system orchestrator is provided for managing an execution of a plurality of tasks using at least one of the plurality of heterogeneous host machines with a corresponding telecommunication application for enabling the distributed system orchestrator to be part of the telecommunication network comprising at least one available heterogeneous host machine of the plurality of heterogeneous host machines and with a task assignment module for assigning each virtualized element of the plurality of virtualized elements to a selected heterogeneous host machine located on the telecommunication network.
According to processing step 104, a plurality of tasks to execute is received using the distributed system orchestrator. Each task comprises a corresponding plurality of virtualized elements.
According to processing step 106, an indication of a present location of each available heterogeneous host machine is obtained using the distributed system orchestrator.
According to processing step 108, an indication of a resource availability for each available heterogeneous host machine is obtained using the distributed system orchestrator.
According to processing step 110, a multi-period workload placement problem is determined by the distributed system orchestrator using the received indication of a present location of each available heterogeneous host machine and the indication of a resource availability of each available heterogeneous host machine.
According to processing step 112, for each task of the plurality of tasks, each corresponding virtualized element of the plurality of corresponding virtualized elements is assigned to a corresponding host machine using the determined multi-period workload placement problem.
In one or more embodiments, the method further comprises executing each of the assigned virtualized elements using the corresponding heterogeneous host machine.
In one or more embodiments of the method, the telecommunication network comprises a virtual ad hoc mobile telecommunication network.
In one or more embodiments, the method further comprises amending the multi-period workload placement problem in response to a given event. In one or more embodiments, the given event comprises a change in resources available.
In one or more embodiments of the method, the amending of the multi-period workload placement problem comprises transferring a given virtualized element from a first given heterogeneous host machine to a second given heterogeneous host machine.
In one or more embodiments of the method, the determining of the multi-period workload placement problem is further performed using at least one property of the telecommunication network.
In one or more embodiments of the method, the method further comprises receiving, from each of the plurality of heterogeneous host machines, an indication of a possible future location; further wherein the determining of the multi-period workload placement problem is further performed using the received indications of a possible future location.
In one or more embodiments of the method, the method further comprises assigning, for each of the plurality of heterogeneous host machines, an indication of a corresponding reputation; further wherein the determining of the multi-period workload placement problem is further performed using the plurality of indications of a corresponding reputation.
In one or more embodiments of the method, the method further comprises obtaining an indication of a corresponding level of energy available in each of the plurality of heterogeneous host machines; further wherein the determining of the multi-period workload placement problem is further performed using the obtained indications of a corresponding level of energy available.
It will be appreciated that the system and the method disclosed above are of great advantage for various reasons.
A first reason is that they make it possible to use a plurality of heterogeneous host machines to execute a plurality of tasks in a dynamic environment.
Another reason is that they enable the use of heterogeneous host machines.
Although the above description relates to a specific preferred embodiment as presently contemplated by the inventor, it will be understood that the invention in its broad aspect includes functional equivalents of the elements described herein.
Clause 1. A system for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment, the system comprising:
a plurality of heterogeneous host machines, each heterogeneous host machine being characterized by corresponding processing resources, each heterogeneous host machine comprising:
a distributed system orchestrator for managing an execution of a plurality of tasks using at least one of the plurality of heterogeneous host machines, wherein the plurality of tasks is comprised of a corresponding plurality of virtualized elements, the distributed system orchestrator comprising:
Clause 2. The system as claimed in clause 1, wherein the multi-period workload placement problem is determined by the distributed system orchestrator using information related to heterogeneous host machines joining or leaving the telecommunication network.
Clause 3. The system as claimed in any one of clauses 1 to 2, wherein the telecommunication network comprises a virtual ad hoc mobile telecommunication network.
Clause 4. The system as claimed in any one of clauses 1 to 3, wherein the multi-period workload placement problem is amended in response to a given event.
Clause 5. The system as claimed in clause 4, wherein the given event comprises a change in resources available.
Clause 6. The system as claimed in clause 4, wherein the amendment of the multi-period workload placement problem comprises transferring a virtualized element from a first given heterogeneous host machine directly to a second given heterogeneous host machine.
Clause 7. The system as claimed in any one of clauses 1 to 6, wherein the heterogeneous host machines are wireless host machines, further wherein the at least one given criterion is selected from a group consisting of:
a minimization of host machine utilization costs;
a minimization of a number of migrations;
a minimization of energy consumption;
a minimization of refused workloads;
a minimization of host machine physical movements;
a throughput of at least one given host machine;
a spectrum sharing behavior between at least two pairs of host machines; and
an interference between at least two pairs of host machines.
Clause 8. The system as claimed in any one of clauses 1 to 7, wherein the telecommunication application of the distributed system orchestrator reserves dedicated suitable routing paths according to the multi-period workload placement problem.
Clause 9. The system as claimed in any one of clauses 1 to 8, wherein the given multi-period workload placement problem is further determined using at least one telecommunication network property.
Clause 10. The system as claimed in clause 9, wherein the at least one telecommunication network property comprises at least one of:
a latency for transferring a first given virtualized element to a given heterogeneous host machine;
a latency for migrating a second given virtualized element from a first given heterogeneous host machine to a second given heterogeneous host machine; and
a network topology.
Clause 11. The system as claimed in any one of clauses 1 to 10, wherein the geolocation module further provides an indication of a possible future position of the corresponding heterogeneous host machine; further wherein the given multi-period workload placement problem is further determined using the indication of a possible future position of the corresponding heterogeneous host machine.
Clause 12. The system as claimed in any one of clauses 1 to 11, wherein each heterogeneous host machine is assigned an indication of a corresponding reputation; further wherein the given multi-period workload placement problem is further determined using the indication of a corresponding reputation.
Clause 13. The system as claimed in any one of clauses 1 to 12, wherein each heterogeneous host machine comprises an energy module for providing an indication of a corresponding level of energy available; further wherein the given multi-period workload placement problem is further determined using the indication of a corresponding level of energy available.
Clause 14. A method for enabling an execution of a plurality of tasks in a heterogeneous dynamic environment, the method comprising:
providing a plurality of heterogeneous host machines, each given heterogeneous host machine having corresponding processing resources, each given heterogeneous host machine comprising:
providing a distributed system orchestrator for managing an execution of a plurality of tasks using at least one of the plurality of heterogeneous host machines with a corresponding telecommunication application for enabling the distributed system orchestrator to be part of the telecommunication network comprising at least one available heterogeneous host machine of the plurality of heterogeneous host machines and with a task assignment module for assigning each virtualized element of the plurality of virtualized elements to a selected heterogeneous host machine located on the telecommunication network;
receiving, using the distributed system orchestrator, a plurality of tasks to execute, each task comprising a corresponding plurality of virtualized elements;
obtaining, using the distributed system orchestrator, an indication of a present location of each available heterogeneous host machine;
obtaining, using the distributed system orchestrator, an indication of a resource availability for each available heterogeneous host machine;
determining, using the distributed system orchestrator, a multi-period workload placement problem using the received indication of a present location of each available heterogeneous host machine and the indication of a resource availability of each available heterogeneous host machine; and
for each task of the plurality of tasks assigning each corresponding virtualized element of the plurality of corresponding virtualized elements to a corresponding host machine using the determined multi-period workload placement problem.
Clause 15. The method as claimed in clause 14, further comprising executing each of the assigned virtualized elements using the corresponding heterogeneous host machine.
Clause 16. The method as claimed in any one of clauses 14 to 15, wherein the telecommunication network comprises a virtual ad hoc mobile telecommunication network.
Clause 17. The method as claimed in any one of clauses 14 to 16, further comprising amending the multi-period workload placement problem in response to a given event.
Clause 18. The method as claimed in clause 17, wherein the given event comprises a change in resources available.
Clause 19. The method as claimed in any one of clauses 17 to 18, wherein the amending of the multi-period workload placement problem comprises transferring a given virtualized element from a first given heterogeneous host machine to a second given heterogeneous host machine.
Clause 20. The method as claimed in any one of clauses 14 to 19, wherein the determining of the multi-period workload placement problem is further performed using at least one property of the telecommunication network.
Clause 21. The method as claimed in any one of clauses 14 to 20, further comprising receiving, from each of the plurality of heterogeneous host machines, an indication of a possible future location; further wherein the determining of the multi-period workload placement problem is further performed using the received indications of a possible future location.
Clause 22. The method as claimed in any one of clauses 14 to 21, further comprising assigning, for each of the plurality of heterogeneous host machines, an indication of a corresponding reputation; further wherein the determining of the multi-period workload placement problem is further performed using the plurality of indications of a corresponding reputation.
Clause 23. The method as claimed in any one of clauses 14 to 22, further comprising obtaining an indication of a corresponding level of energy available in each of the plurality of heterogeneous host machines; further wherein the determining of the multi-period workload placement problem is further performed using the obtained indications of a corresponding level of energy available.
A practical implementation of a distributed multi-period orchestration system enabling the execution of a plurality of tasks on top of a heterogeneous dynamic virtualization ready physical infrastructure is presented.
The plurality of tasks:
The heterogeneous dynamic virtualization-ready physical infrastructure:
The practical implementation of a distributed multi-period orchestration system enabling the execution of a plurality of tasks on top of a heterogeneous dynamic virtualization-ready physical infrastructure relies on the following list of components:
A collaborative application can be seen as a plurality of tasks (collection of workloads, application elements, application nodes, etc.) that may mutually interfere, interact, collaborate with each other. A user or a process aiming to run an application on top of a virtualization ready physical infrastructure powered by the distributed multi-period orchestrator must translate the given plurality of tasks into two virtual graphs GzV (Vz, Az) and GzT (Vz, Uz), where each task is mapped to a specific virtualized element (multiple tasks can be packed within the same virtualized element). During this translation process, the relevant application parameters are configured, e.g., flavor of each virtualized element (type of Docker container, type of Ubuntu virtual machine, etc.), CPU and RAM requirements and so on.
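The translation step above can be sketched as follows. This is a minimal illustration under stated assumptions: the node set Vz is shared by both graphs, GzV(Vz, Az) carries the weighted traffic demands, and GzT(Vz, Uz) carries the pairwise task relations; the data structures and field names are illustrative, not from the disclosure:

```python
# Hypothetical sketch of translating a plurality of tasks into the pair of
# virtual graphs GzV(Vz, Az) and GzT(Vz, Uz) described in the text.

def build_virtual_graphs(tasks, demands, relations):
    """tasks: iterable of virtualized-element ids (the shared node set Vz);
    demands: (src, dst, bandwidth) tuples -> directed arcs Az of GzV;
    relations: (a, b) pairs -> undirected edges Uz of GzT."""
    Vz = set(tasks)
    Az = {(s, d): bw for s, d, bw in demands if s in Vz and d in Vz}
    Uz = {frozenset(pair) for pair in relations if set(pair) <= Vz}
    return (Vz, Az), (Vz, Uz)

GzV, GzT = build_virtual_graphs(
    ["collector", "processor"],
    [("collector", "processor", 10.0)],
    [("collector", "processor")],
)
print(GzV[1])  # {('collector', 'processor'): 10.0}
```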
This operation can be naturally done through a User Interface (UI) of:
The multi-period workload generation component connected to the UI must have a network connection with at least one of the hosting machines of the virtualization ready physical infrastructure; if at least one hosting machine of the virtualization ready physical infrastructure has global internet connectivity, the multi-period workload generation component can be run somewhere in the cloud; otherwise it must run on a device locally connected to at least one hosting machine of the virtualization ready physical infrastructure, or directly on one of the hosting machines. In the latter case, the interaction between the user and the distributed multi-period orchestrator is enabled by a communication link provided by the telecommunication application described in Section 12.
In principle, any collaborative application (plurality of tasks) can be translated into the corresponding pair of GzV (Vz, Az) and GzT (Vz, Uz) graphs.
Examples of such collaborative applications include:
The multi-period workload generation process allows the distributed multi-period orchestrator to manage a highly heterogeneous set of applications (plurality of tasks). In particular, let us put the emphasis on the heterogeneity in terms of mobility requirements:
As already mentioned, during the multi-period workload generation process, each virtualized element that will represent one or more application tasks from the original plurality of tasks must be characterized by the corresponding set of parameters. These parameters will later allow the distributed multi-period orchestrator to optimally place each virtualized element on top of the virtualization ready physical infrastructure. Here follows the detailed list of these parameters:
It is worth pointing out that, besides configuring virtualized element parameters, the user may be also requested to:
Furthermore, note that if a given application (multi-period workload) has just best-effort QoS requirements, it means that it can be placed on any kind of hosting machine without accounting for their availability periods, as well as for the amount of bandwidth reserved between the multiple virtualized elements. In this case, it would be enough to create an application graph with an empty set Az, and corresponding parameters
To minimize the negative effects of hardware failures, H copies of each virtualized element are placed on different physical servers, and a certain amount of bandwidth is reserved between original and replicated virtualized elements to support the data flow generated to keep the latter up to date.
This process can be naturally modeled through a transformation of the virtual graph Gv (V, A) similar to that illustrated in Section 3.3. As shown in
Note that replicated virtualized elements are not supposed to consume any resource; however the proper amount of computing/storage resources and physical capacities (the same of the original element) has to be reserved to guarantee that the requirements will be respected in case of failure of the original virtualized element.
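The replication transformation can be sketched as below. This is an illustrative sketch under assumptions: H replica nodes are added per virtualized element, a synchronization-bandwidth arc connects each original to each of its replicas, and replicas are flagged so their resources are reserved rather than consumed; naming is hypothetical:

```python
# Hedged sketch of the graph transformation for fault tolerance: H replicas
# per virtualized element, with bandwidth reserved between original and
# replica, and replica resources reserved (not consumed) as stated above.

def replicate(nodes, H, sync_bandwidth):
    """nodes: dict element id -> resource requirements.
    Returns (extended node set, reservation arcs with bandwidth)."""
    extended = dict(nodes)
    arcs = {}
    for name, req in nodes.items():
        for h in range(1, H + 1):
            replica = f"{name}_replica{h}"
            # Same requirements as the original, but only reserved.
            extended[replica] = {**req, "reserve_only": True}
            arcs[(name, replica)] = sync_bandwidth
    return extended, arcs

nodes = {"db": {"cpu": 2}}
extended, arcs = replicate(nodes, H=2, sync_bandwidth=5.0)
print(sorted(extended))  # ['db', 'db_replica1', 'db_replica2']
```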
If storage resources are allowed to be allocated on different hosting machines with respect to those serving the computing resources (see for instance Amazon Elastic Block Store [1]), the application graph is modified as follows (see also
Once a plurality of tasks belonging to the same application is fully translated into the corresponding pair of graphs representing a multi-period workload, the whole set of parameters that we just described is transferred to the distributed multi-period orchestration instance of at least one hosting machine. The same process is repeated whenever the user modifies the parameters of a multi-period workload already placed on top of the virtualization ready physical infrastructure.
During the life cycle of the application (multi-period workload), the hosting machine that originally received the placement request will keep updating the originating multi-period workload generation module about the state of the virtualized elements, e.g., average performance, IDs of queued virtualized elements, position of involved hosting machines, etc.
The continuous flow of application-related information between these two modules makes it possible to exploit the multi-period nature of the distributed orchestration system to generate new virtualized elements (application nodes) in real time: this mechanism is driven by the real-time output of the virtualized elements already running. Section 3.5 discloses an example of how real-time virtualized element (workload) generation can be leveraged in the context of a 3D mapping application powered by UAVs.
3.5 An Example: Autonomous 3D Mapping with UAVs
An autonomous 3D mapping mission can be characterized by the three-stage (multi-period) work-flow represented in
This 3-stage workload has to be further extended to generate the corresponding pair of virtual graphs GzV (Vz, Az) and GzT (Vz, Uz), shown in
It is worth pointing out that a further transformation (following the logic described in Section 3.3) may be applied to graphs GzV (Vz, Az) and GzT (Vz, Uz) to separate computing and storage application nodes (see
To conclude, note that the multi-period nature of the new distributed multi-period orchestration system allows the application designer to run applications (multi-period workloads) where a part of the virtualized elements (application nodes) can be generated in real time in an on-demand fashion, according to the output of the virtualized elements (application nodes) already running. For instance, in our 3D mapping example, the number of 3D processing virtualized elements (application nodes) may be dynamically computed by the optimization algorithm run inside the 3D optimizer virtualized elements; this algorithm is designed to decide how many sub-regions have to be reconstructed in parallel to minimize 3D reconstruction computing times. Otherwise, by deciding the number of 3D processing application nodes in advance, the 3D optimizer virtualized elements will simply decide which of these 3D processing nodes should be activated. The new multi-period orchestration scheme grants application designers/owners a substantial degree of freedom during the application development/planning stage.
The task assignment module is the core of the distributed multi-period orchestrator. It is responsible for computing the multi-period placement solution describing how to map each virtualized element on top of a hosting machine while optimizing one or multiple given criteria and respecting a given set of system constraints. The main blocks of the task assignment module consist of two strongly tied components:
It will be appreciated that the task assignment module is also referred to as the distributed multi-period orchestrator.
The multi-period workload placement problem is the mathematical representation of the orchestration process carried out to virtualize multiple multi-period workloads on top of the available virtualization ready physical infrastructure. The optimization problem is obtained by leveraging all the definitions previously presented in Table 2.
To summarize, the multi-period workload placement problem is presented below:
Given
The distributed multi-period orchestrator must decide
To minimize eight cost components
While respecting multiple problem constraints, including those to
It will be appreciated that some problem variables do not represent direct decisions of the distributed multi-period orchestrator. They are instead used as auxiliary variables to quantify objective function components and evaluate the secondary effects produced by the main decision variables. These variables can be found in Table 2.
The multi-period workload placement problem can be formally expressed by the following Mixed Integer Non-linear Programming (MINLP) formulation, which is presented one group of equations at a time, each group being followed by the corresponding description:
The multi-objective function is made by eight different cost minimization components:
The first group of constraints to be added concerns the basic placement rules for the application nodes:
Equation (2) prevents the distributed multi-period orchestrator from placing an application node multiple times, while Equation (5) prevents the distributed multi-period orchestrator from removing a virtualized element (application node) already placed during previous optimization rounds. According to Equation (4), a hosting machine must be activated to host any virtualized element (application node), and the distributed multi-period orchestrator must respect the compatibility requirements of the hosted virtualized element (ρ and η parameters). Equation (3) states that a virtualized element (application node) i∈Vz of application z∈Z can be placed on hosting machine j∈N only if the latter is not busy, or if the virtualized element is already placed on it. A busy hosting machine is typically a moving hosting machine in the process of performing a specific task of a virtualized element, or a task in support of another virtualized element (e.g., moving to improve network performance). According to Equation (6), an application z∈Z is considered placed (gz=1) if and only if its mandatory virtualized elements (application nodes) i∈Vz|βi=1 are placed during the current optimization round. Similarly, Equation (7) states that an application is considered placed if and only if at least
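The basic placement rules just described can be illustrated with a simple validity check; this is a hedged sketch (not the actual formulation), with data structures assumed for illustration:

```python
# Illustrative validity check for the basic placement rules: each application
# node placed at most once (cf. Eq. (2)), never removed once placed (cf.
# Eq. (5)), and only on an activated, compatible host (cf. Eq. (4)).

def valid_basic_placement(x, previous_x, active_hosts, compatible):
    """x, previous_x: dict (node, host) -> 0/1 placement variables;
    active_hosts: set of activated hosts;
    compatible: dict node -> set of allowed hosts."""
    nodes = {n for (n, _h) in x}
    for n in nodes:
        if sum(x.get((m, h), 0) for (m, h) in x if m == n) > 1:
            return False  # node placed multiple times
    for (n, h), v in previous_x.items():
        if v == 1 and x.get((n, h), 0) == 0:
            return False  # already placed element removed
    for (n, h), v in x.items():
        if v == 1 and (h not in active_hosts or h not in compatible[n]):
            return False  # host inactive or incompatible
    return True

x = {("a", "h1"): 1, ("a", "h2"): 0}
print(valid_basic_placement(x, {}, {"h1"}, {"a": {"h1"}}))  # True
```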
To correctly manage the corresponding set of hosting machines, the distributed multi-period orchestrator must guarantee that enough resources are available on each hosting machine to host the desired subset of virtualized elements (application nodes). The distributed multi-period orchestrator must also consider that some virtualized elements (application nodes) may be able to share the same amount of resources when placed on the same hosting machine. The following group of constraints is introduced to correctly manage the physical resources:
Equations (19)-(22) guarantee that hosting machine resources are not consumed beyond availability, considering that some virtualized elements (those belonging to the same application type Sz and capable of sharing resources, see parameter
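The resource constraints can be illustrated with a small feasibility check. This is a sketch under a simplifying assumption about the sharing semantics (elements in the same sharing group are counted as one allocation of the group's maximum requirement); it is not the actual formulation:

```python
# Hedged sketch of the capacity check: consumption on a host must not exceed
# availability, and co-placed elements that can share resources (same
# application type, as described above) are counted once per sharing group.

def capacity_respected(host_capacity, placed, sharing_groups):
    """host_capacity: dict resource -> available amount on one host;
    placed: dict element id -> resource requirements on that host;
    sharing_groups: list of sets of element ids sharing one allocation."""
    counted = {}
    grouped = set()
    for group in sharing_groups:
        members = [e for e in group if e in placed]
        if members:
            grouped.update(members)
            # One (maximum) allocation covers the whole group.
            for r in host_capacity:
                counted[r] = counted.get(r, 0) + max(
                    placed[e].get(r, 0) for e in members
                )
    for e, req in placed.items():
        if e not in grouped:
            for r in host_capacity:
                counted[r] = counted.get(r, 0) + req.get(r, 0)
    return all(counted.get(r, 0) <= cap for r, cap in host_capacity.items())

placed = {"a": {"cpu": 2}, "b": {"cpu": 2}}
print(capacity_respected({"cpu": 3}, placed, [{"a", "b"}]))  # True
```

Without the sharing group, the same two elements would consume 4 CPU units and exceed the capacity of 3.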
All the constraints related to hosting machine position and the corresponding positioning rules to be respected are now introduced:
νijX≥λiX−λjX ∀(i,j)∈E, (35)
νijX≥−λiX+λjX ∀(i,j)∈E, (36)
νijY≥λiY−λjY ∀(i,j)∈E, (37)
νijY≥−λiY+λjY ∀(i,j)∈E, (38)
ν̄iX≥λiX−λ̄iX ∀i∈N, (39)
ν̄iX≥−λiX+λ̄iX ∀i∈N, (40)
ν̄iY≥λiY−λ̄iY ∀i∈N, (41)
ν̄iY≥−λiY+λ̄iY ∀i∈N, (42)
xijz≤ĀzijFO ∀z∈Z,i∈Vz,j∈N, (43)
λjY≤AiDON+M̂(1−xijz) ∀z∈Z,i∈Vz,j∈N, (44)
λjY≥AiDOS−M̂(1−xijz) ∀z∈Z,i∈Vz,j∈N, (45)
λjX≤AiDOE+M̂(1−xijz) ∀z∈Z,i∈Vz,j∈N, (46)
λjX≥AiDOW−M̂(1−xijz) ∀z∈Z,i∈Vz,j∈N, (47)
λiY≤AiAON ∀i∈N, (48)
λiY≥AiAOS ∀i∈N, (49)
λiX≤AiAOE ∀i∈N, (50)
λiX≥AiAOW ∀i∈N, (51)
λiX,λiY∈ℝ ∀i∈N, (52)
νijX,νijY≥0 ∀(i,j)∈E, (53)
ν̄iX,ν̄iY≥0 ∀i∈N, (54)
… (55)
Equations (35)-(38) make it possible to compute the X-Y distances between two different hosting machines i,j∈N|i≠j. Similarly, Equations (39)-(42) are used to estimate the X-Y distances between the pre-optimization and post-optimization positions of the same hosting machine i∈N. Equation (43) allows a virtualized element (application node) i∈Vz of application z∈Z to be placed only on top of hosting machines j∈N lying within the FOA defined by the application during the workload generation phase. Equations (44)-(47) force each hosting machine to move toward the position (a valid set of coordinates within the application DOA) requested by the hosted virtualized element (application node). Thus, a hosting machine cannot host, at the same time, two different virtualized elements (application nodes) related to non-overlapping DOAs. On the other side, Equations (48)-(51) prevent a hosting machine from moving beyond the boundaries of its rectangular AOA. Recall that these equations can be easily modified to account for any area shape. For sake of completeness, Equations (52)-(55) define the domains of the variables just introduced. M̂ is used to denote a large enough value, e.g., 100000.
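The paired inequalities in Equations (35)-(42) are the classical linearization of an absolute value: the smallest ν satisfying both ν ≥ a−b and ν ≥ b−a equals |a−b|. A small numeric illustration (values are arbitrary):

```python
# Numeric illustration of the |a - b| linearization used in Eqs. (35)-(42):
# the tightest feasible v under v >= a - b and v >= b - a is exactly |a - b|.

def linearized_abs(a, b):
    """Smallest v satisfying v >= a - b and v >= b - a, i.e. |a - b|."""
    return max(a - b, b - a)

# X-distance between the positions of two hosting machines, as in Eqs. (35)-(36):
print(linearized_abs(10.0, 4.0))  # 6.0
# Distance between pre- and post-optimization positions, as in Eqs. (39)-(40):
print(linearized_abs(3.0, 7.5))  # 4.5
```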
Moving nodes may not be connected to an unlimited power source. For this reason, at any optimization round, the distributed multi-period orchestrator must verify that at least one reachable recharging station is in range to support each moving hosting machine. This means that the recharging station selected by the distributed multi-period orchestrator may be different from the charging station that will be selected by the energy manager described in Section 6. The following group of constraints is introduced to guarantee the availability of recharging stations:
Equation (56) forces the distributed multi-period orchestrator to assign each moving hosting machine to one hosting machine with battery recharging capabilities. Equations (57)-(60) compute the distance between a hosting machine and its assigned hosting machine with battery recharging capabilities. Equation (61) computes the traveling time necessary to reach the hosting machine with battery recharging capabilities while respecting the maximum speed of the considered moving hosting machine, while Equation (62) computes the minimum traveling time
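The reachability requirement can be illustrated as follows; this is a hedged sketch in which the distance metric (Manhattan) and the residual-flight-time parameter are assumptions for illustration, not taken from the formulation:

```python
# Illustrative check for recharging-station reachability: traveling time to
# the assigned recharging-capable hosting machine is the distance divided by
# the mover's maximum speed, as in the constraints discussed above.

def travel_time(position, station, max_speed):
    dx = abs(position[0] - station[0])
    dy = abs(position[1] - station[1])
    return (dx + dy) / max_speed  # Manhattan distance (an assumption here)

def station_reachable(position, station, max_speed, residual_flight_time):
    """True if the station can be reached before energy runs out."""
    return travel_time(position, station, max_speed) <= residual_flight_time

print(station_reachable((0.0, 0.0), (30.0, 40.0), max_speed=10.0,
                        residual_flight_time=8.0))  # True (7.0 <= 8.0)
```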
The following group of constraints is used to manage the placement aspects related to the fact the hosting machine may appear and depart in an emergent (opportunistic, unscheduled) way:
κ̄ixijz≤κj ∀z∈Z,i∈Vz,j∈N, (68)
ψijz≥… (69)
ψijz≥… (70)
ψijz≤M̂… (71)
ψijz≥0 ∀z∈Z,i∈Vz,j∈N. (72)
Equation (68) states that a multi-period placement configuration is valid if and only if a hosting machine j∈N has a reputation κj greater than the minimum reputation level
Virtualized elements (application nodes) can be moved from their current hosting machine to another hosting machine when requested by the users (by changing, for instance, the FOA of the application node) or to mitigate resource availability problems. The next group of constraints is defined to manage this process, which can be completed by exploiting network-based data transfer, as well as the physical movement of data. Note that set Ni with i∈N is used to denote the set of hosting machines defined as N\{i}, while set Vzi with i∈Vz and z∈Z is used to denote the set of application nodes defined as Vz\{i}.
Equation (73) is necessary to correctly activate binary migration variables any time a virtualized element (application node) is moved to a new hosting machine, while Equation (74) guarantees that only one type of migration is selected (network-based, physical active, physical opportunistic) and that the migration is not done toward a busy hosting machine. Equation (75) prevents the distributed multi-period orchestrator from commanding an active physical migration if the current hosting machine cannot move fast enough to cover the required distance before the maximum down-time delay expires. Equation (76) forces the hosting machine supporting an active physical migration to physically move toward the destination hosting machine. It will be appreciated that the destination hosting machine will be free to move, if necessary, after the successful migration; for this reason, the pre-optimization position (not the post-optimization one) of the destination hosting machine is considered in Eq. (76). Equations (77), (80) and (82) forbid the distributed multi-period orchestrator from supporting physical migrations for the virtualized elements (application nodes) of a given application when the hosting machines are currently running the virtualized elements (application nodes) of other applications (in this way we prevent performance degradation for these other applications). It will be appreciated that these equations could be relaxed to allow a hosting machine to first migrate over the network all the virtualized elements (application nodes) of the other applications, and then start the physical migrations. Further information on the control of variables
Equation (78) allows a hosting machine to support an opportunistic physical migration if the hosting machine itself had previously communicated that it will move toward the necessary destination hosting machine, while Equation (79) guarantees that the pre-planned movement will end before the maximum down-time period allowed for the virtualized element (application node) to be migrated expires. Equation (81) prevents a physical migration hosting machine from becoming the migration target of other virtualized elements (application nodes) of the same application. It will be appreciated that we do not explicitly consider virtualized elements (application nodes) of other applications because they are prevented from migrating toward a physical migration hosting machine by the presence of Equations (77) and (80). Equation (82) prevents physical migration hosting machines from hosting virtualized elements (application nodes) of other applications not involved with the migrating virtualized elements (application nodes).
Finally, Equations (83)-(85) force the distributed multi-period orchestrator to move the virtualized elements (application nodes) sharing the same resources together. For the sake of completeness, the domains of the migration variables are defined by Equation (86).
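As an illustration of the migration constraints above, the following sketch (hypothetical helper names and a simple 2-D distance model, not part of the formulation) checks which migration types are admissible for a single move:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:
    x: float
    y: float
    speed: float            # maximum movement speed, m/s
    busy: bool = False

def feasible_migration_types(src: Host, dst: Host, max_downtime_s: float,
                             network_reachable: bool,
                             planned_arrival_s: Optional[float] = None) -> set:
    """Return the migration types the constraints allow for moving a
    virtualized element from src to dst (illustrative helper mirroring
    Eqs. (74), (75) and (79))."""
    if dst.busy:                         # Eq. (74): no migration toward a busy host
        return set()
    types = set()
    if network_reachable:                # network-based migration
        types.add("network")
    dist = ((src.x - dst.x) ** 2 + (src.y - dst.y) ** 2) ** 0.5
    # Eq. (75): active physical migration only if src can cover the distance
    # to dst's pre-optimization position before the down-time deadline.
    if src.speed > 0 and dist / src.speed <= max_downtime_s:
        types.add("physical_active")
    # Eq. (79): opportunistic migration only if the pre-planned movement
    # of the current host ends before the same deadline.
    if planned_arrival_s is not None and planned_arrival_s <= max_downtime_s:
        types.add("physical_opportunistic")
    return types
```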
All the constraints and variables required to optimize routing in the virtualization ready physical infrastructure managed by the distributed multi-period orchestrator to support standard traffic demands, migration traffic, and deployment traffic are now introduced:
Equations (87)-(89) are necessary to correctly compute the traffic demand placement variables y. Equation (90) states that at least Λ (reliability level) paths are activated to serve each traffic demand (i,j)∈Az of application z∈Z, while Equation (91) prevents the distributed multi-period orchestrator from activating the wrong paths (those not connecting the source and the destination of the corresponding traffic demand once it has been placed). Equation (92) has the same responsibility as Equation (90), but in this case the routing paths are selected to support virtualized element (application node) migrations. Similarly to (91), Equation (93) guarantees that the activated paths are able to support the pair of hosting machines involved in the corresponding migration. Again, Equations (94)-(95) are used to activate at least Λ routing paths to support the first deployment of a virtualized element (application node), while choosing the correct paths in terms of source and destination hosting machines. Equations (96)-(98) are used to compute the total amount of flow produced on each link by each type of traffic, i.e., standard, migration-based, deployment-based. Note that Υ variables are used to discard the portion of traffic that can be shared by co-placed virtualized elements (application nodes). Finally, Equations (99)-(101) prevent the distributed multi-period orchestrator from modifying the routing variables involving busy links (e.g., links of hosting machines that are moving). For the sake of completeness, variable domains are defined by Equations (102)-(105).
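The reliability-driven path activation of Eqs. (90)-(91) can be sketched as follows (a simplified model over explicit path lists; function and parameter names are illustrative):

```python
def activate_paths(candidate_paths, src_host, dst_host, reliability_level):
    """Sketch of Eqs. (90)-(91): activate at least reliability_level (Λ)
    candidate paths for a traffic demand, keeping only paths that connect
    the hosting machines where the demand's endpoints were actually placed.
    Paths are plain host-name sequences."""
    valid = [p for p in candidate_paths if p[0] == src_host and p[-1] == dst_host]
    valid.sort(key=len)                 # prefer shorter paths (hop count)
    if len(valid) < reliability_level:
        raise ValueError("not enough paths for the requested reliability level")
    return valid[:reliability_level]
```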
In a fully mobile environment, network performance can be guaranteed only if node movement is somehow controlled. The moving nodes are dedicated to serving only a specific application z∈Z. In this way, the movements caused by the virtualized elements (application nodes) of an application should not interfere with the performance of other applications running on an overlapping subset of hosting machines. The following group of constraints is defined:
First, Equations (106)-(108) are used to compute the total amount of traffic carried by a link that is generated by a specific application (for the three types of traffic). Note that, for our purpose, we do not have to consider the sharing variables Υ as in Equations (96)-(98). Then, Equation (109) is used to determine whether a link is used by the traffic related to a specific application z∈Z, while Equation (110) determines, in the same way, whether a hosting machine is serving traffic generated by a specific virtualized element (application node). Equations (111)-(113) allow a hosting machine to be marked as a communication node for a given application z∈Z if and only if it is not involved in any way with other applications (neither hosting their virtualized elements, nor serving their network traffic). Finally, according to Equation (114), only communication hosting machines assigned to a given application can move. For the sake of completeness, variable domains are defined by Equations (115)-(118).
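The communication-node marking of Eqs. (111)-(113) can be sketched as follows (data layout and names are illustrative assumptions; values are sets of application IDs per host):

```python
def mark_communication_nodes(hosts, app, hosted_apps, served_traffic_apps):
    """Sketch of Eqs. (111)-(113): a hosting machine may be marked as a
    communication node for application `app` (and thus be allowed to move,
    Eq. (114)) only if it neither hosts virtualized elements of other
    applications nor serves their network traffic."""
    marked = set()
    for h in hosts:
        others_hosted = hosted_apps.get(h, set()) - {app}
        others_served = served_traffic_apps.get(h, set()) - {app}
        if not others_hosted and not others_served:
            marked.add(h)
    return marked
```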
In wireless networks, there exists a potential physical network link for each pair of hosting machines with a wireless network interface. The network bandwidth offered by each wireless link depends on the distance between the hosting machines at the extremities of the considered link. Note that in the case of wired links, the link throughput/capacity is instead fixed (one single horizontal piece). The following group of constraints allows the current link capacities to be correctly computed and, consequently, respected:
Equation (119) is used to correctly activate the right piece of the throughput/distance function of each physical link, while Equation (120) imposes that one piece of that function is activated per link. Equations (121) and (122) prevent the capacity of each link from being over-utilized (with both pre-optimization and post-optimization node positions). Equations (123)-(124) compute the link delay with pre-optimization and post-optimization node positions, while Equations (125)-(126) do the same for path delays. Finally, Equations (127) and (128) enforce maximum path delay constraints by considering both pre-optimization and post-optimization positions. For the sake of completeness, variable domains are defined by Equations (129)-(131).
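A minimal sketch of the piecewise throughput/distance evaluation behind Eqs. (119)-(120), assuming a piecewise-constant function (names are illustrative):

```python
def link_capacity(distance_m, pieces):
    """Sketch of Eqs. (119)-(120): exactly one piece of the piecewise
    throughput/distance function is active per link, namely the one whose
    distance range contains the current inter-host distance. `pieces` is a
    list of (max_distance_m, capacity_mbps) pairs with increasing ranges;
    a wired link is the degenerate single-piece case."""
    for max_dist, capacity in pieces:
        if distance_m <= max_dist:
            return capacity
    return 0.0          # hosts out of range: no usable capacity
```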
Wireless nodes communicating over the same Wireless Local Area Network (WLAN) are typically required to configure all the D2D wireless links on the same transmission channel. This leads all the links of the same WLAN that are in range with respect to each other to share the same spectrum, and thus the same transmission capacity. The following group of constraints is introduced to model this phenomenon:
Equation (133) is necessary to evaluate when a hosting machine is close enough to another hosting machine to be considered a member of the wireless cell of the latter. Equations (134)-(135) are used to determine the physical links that are members of a given wireless cell: it is sufficient that one of the two endpoints of the considered link is a member of the wireless cell itself. Equations (136) and (137) prevent the capacity of each wireless cell from being over-utilized (with both pre-optimization and post-optimization node positions). Finally, Equations (138)-(139) compute the wireless cell utilization costs by considering both pre-optimization and post-optimization node positions. For the sake of completeness, variable domains are defined by Equations (140)-(141).
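The wireless cell membership and shared-capacity checks can be sketched as follows (a simplified 2-D range model; all names are illustrative):

```python
def cell_members(positions, center, cell_range_m):
    """Sketch of Eq. (133): a hosting machine belongs to the wireless cell
    of `center` when it lies within `cell_range_m` of it."""
    cx, cy = positions[center]
    return {h for h, (x, y) in positions.items()
            if ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 <= cell_range_m}

def cell_not_overloaded(link_flows, members, cell_capacity):
    """Sketch of Eqs. (134)-(137): every link with at least one endpoint in
    the cell shares the cell spectrum, so the aggregate flow over those
    links must not exceed the single shared capacity."""
    load = sum(f for (a, b), f in link_flows.items() if a in members or b in members)
    return load <= cell_capacity
```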
The MINP formulation just presented in Section 4.1 to define the multi-period workload placement problem is crucial to:
The role of the distributed multi-period orchestrator is to heuristically compute, in real-time, a feasible and optimal placement solution.
A small part of the information necessary to solve the multi-period workload placement problem is found directly in configuration files visible to the distributed multi-period orchestrator instance (see Section 4.3) running on each hosting machine. The remaining information is instead collected by the distributed multi-period orchestrator instance of each hosting machine from the other auxiliary modules (see Section 4.4).
The implementation details of the distributed multi-period workload placement algorithm run by the distributed multi-period orchestrator instance of each hosting machine (when necessary) are now introduced. A founding principle of the algorithm is that the optimization process should not consider, at each optimization iteration, the whole virtualization ready physical infrastructure. Such a global approach would create issues in terms of:
To mitigate such problems, multiple sub-clusters i∈Q made up of hosting machines and links lying in close proximity (in terms of hop-distance) are dynamically built. In this way, each sub-cluster i∈Q can solve a small-size instance of the multi-period workload placement problem involving just the hosting machines belonging to the corresponding sub-cluster, i.e.:
And all related parameters. The flow process describing the optimal orchestration mechanism is now presented:
A triggering event requiring placement optimization is registered by the distributed multi-period orchestration instance of a hosting machine belonging to N:
The generation of a multi-period placement optimization or multi-period placement re-organization request triggers the dynamic formation of new sub-clusters. First of all, the behavior of the hosting machine whose distributed multi-period orchestrator instance generated the optimization request is now analyzed:
All the hosting machines already supervising a sub-cluster that receive a request will automatically try to solve the multi-period workload placement problem within the same sub-cluster. Otherwise, each hosting machine has a certain probability of launching the formation of a new sub-cluster that it will supervise. Note that each supervisor candidate can build multiple clusters of different size in terms of hop-distance from the supervisor hosting machine. The cluster formation managed by a supervisor hosting machine is performed through a consensus algorithm supported by DASS to distribute the necessary information.
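The hop-distance-based cluster formation can be sketched as a breadth-first search, assuming the supervisor already knows the local adjacency (a simplification of the consensus-based process; names are illustrative):

```python
from collections import deque

def build_subcluster(adjacency, supervisor, max_hops):
    """Illustrative sketch of the cluster-formation step: the sub-cluster
    supervised by `supervisor` collects every hosting machine within
    `max_hops` of it (breadth-first search over the network graph). A
    supervisor candidate can call this with several `max_hops` values to
    build clusters of different size."""
    hops = {supervisor: 0}
    queue = deque([supervisor])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue                     # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops)
```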
Before being ready to compute the best multi-period workload placement solution, the clusters must be further extended to account for:
It will be appreciated that sub-cluster supervisors may be controlled by specific algorithms aiming to merge overlapping sub-clusters. Furthermore, other algorithms may be constantly run to delete sub-clusters that become idle, as well as to split two portions of the same sub-cluster that do not interact with each other.
The supervisor hosting machine of a sub-cluster distributes all the new application information to the distributed multi-period orchestrator instances of all the sub-cluster members (through DASS, see Section 5). If the sub-cluster is new, the distributed multi-period orchestrator instances of all the hosting machines in the sub-cluster will distribute, again through DASS, all the other problem parameters. Otherwise, this information should already be available on each hosting machine.
Once each sub-cluster distributed multi-period orchestrator instance retrieves all the necessary problem parameters, it repeats a certain number of iterations of one or more resolution algorithms. At the end of the process, or after a user-configured time-out, only the solution with the best objective function is kept. It will be appreciated that any algorithm generating feasible solutions for the MINP formulation of Section 4.1 can be leveraged, including meta-heuristics, local searches, greedy algorithms, genetic algorithms and many others. In this case we propose to use two different greedy algorithms, Feasible Placement (FP) and Optimal Placement (OP), each applied in two different modes, i.e., partial (only the variables related to the application nodes directly involved in the placement optimization—e.g., those of a new application—can be adjusted) and full (the whole set of sub-cluster variables can be optimized).
Partial FP and OP should be tried first to avoid migrations and configuration adjustments that may negatively affect the performance of the application nodes already running. In case the solutions of partial methods are not considered good enough, full FP and OP are launched to look for better solutions.
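The partial-then-full escalation strategy can be sketched as follows (mode names, the good-enough test and the minimization convention are illustrative assumptions):

```python
def solve_with_escalation(solvers, good_enough):
    """Sketch of the strategy above: try the partial variants first (they
    leave already-running application nodes untouched) and escalate to the
    full variants only when no partial solution reaches the `good_enough`
    threshold. `solvers` maps a mode name to a callable returning
    (objective, solution) or None when infeasible; lower objectives win."""
    best = None
    for mode in ("partial_FP", "partial_OP"):
        result = solvers[mode]()
        if result is not None and (best is None or result[0] < best[0]):
            best = result
    if best is not None and good_enough(best[0]):
        return best                      # accept without disruptive changes
    for mode in ("full_FP", "full_OP"):
        result = solvers[mode]()
        if result is not None and (best is None or result[0] < best[0]):
            best = result
    return best
```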
Both FP and OP are based on the same macro-routines:
The macro-routines above are combined to describe the FP algorithm:
If the BF algorithm is considered, the only difference with respect to the FP procedure just described is the fact that all the hosting machines of the RSPN list are tested with FE (instead of passing to the next step any time a feasible solution is identified) to allow the algorithm to choose the best local decision. It will be appreciated that the greedy approaches of both FP and OP can lead to local optima with a significant gap from the real optimum solution. It will be appreciated that an additional step can be added between FTPV and FTFV to test the different migration types. In a FP approach, the first feasible migration type is maintained, while in a BF approach, all three migration types could be evaluated (network-based, physical active, physical opportunistic).
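The difference between the first-feasible (FP-style) and best-local (BF-style) decisions over a ranked candidate list can be sketched as follows, using residual capacity as an illustrative local score (names are assumptions, not the source's routines):

```python
def place_first_fit(demand, ranked_hosts, free_capacity):
    """FP-style local decision: accept the first hosting machine of the
    ranked candidate list that can fit the node."""
    for host in ranked_hosts:
        if free_capacity[host] >= demand:
            return host
    return None

def place_best_fit(demand, ranked_hosts, free_capacity):
    """BF-style local decision: test every feasible candidate of the list
    and keep the one leaving the least residual capacity, i.e., the best
    local choice among all of them."""
    feasible = [h for h in ranked_hosts if free_capacity[h] >= demand]
    if not feasible:
        return None
    return min(feasible, key=lambda h: free_capacity[h] - demand)
```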
DASS is then used by each sub-cluster distributed multi-period orchestrator instance (one per hosting machine) to share the best objective function found. The sub-cluster supervisor will then select the best value and retrieve the corresponding placement solution from the multi-period orchestrator instance that obtained it.
It is worth pointing out that the resolution scheme just presented can be naturally applied to any version of the multi-period workload optimization problem. It could also be easily adapted to deal with other mathematical formulations of the same problem.
All the sub-cluster supervisor hosting machines will transmit the pair composed of the best objective function and the corresponding multi-period workload placement solution to the distributed multi-period orchestrator instance that originally generated the optimization/re-organization request. This distributed multi-period orchestrator instance is thus responsible for comparing all the solutions received within a pre-configured time limit from multiple sub-cluster supervisors and electing the sub-cluster that won the multi-period workload placement bidding process. The ID and address of the supervisor of the winning sub-cluster are also communicated to the multi-period workload generation module used to create, manage and stop the applications.
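The bidding step can be sketched as follows (the tuple layout and time-limit handling are illustrative assumptions):

```python
def elect_winning_subcluster(bids, time_limit_s):
    """Sketch of the bidding process above: compare the (arrival_time_s,
    objective, supervisor_id) bids received within the pre-configured time
    limit and elect the supervisor whose sub-cluster produced the best
    (lowest) objective; late bids are ignored."""
    on_time = [b for b in bids if b[0] <= time_limit_s]
    if not on_time:
        return None
    return min(on_time, key=lambda b: b[1])[2]
```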
When the distributed multi-period orchestrator instance is initialized on a hosting machine, a configuration file created by the virtualization ready physical infrastructure manager is read to correctly set some input parameters directly related to the distributed orchestration process:
The distributed multi-period orchestrator instance running on each hosting machine exploits the data distribution/replication services of the DASS to coordinate the distributed solution computation process. A large portion of these interactions has been already documented in Section 4.2. However, it was not mentioned that DASS is crucial to force all the distributed multi-period orchestrator instances to converge to the same set of orchestration parameters (see Section 4.3). This specific convergence task can be executed in collaboration with the access manager described in Section 10.
The distributed multi-period orchestrator instance retrieves all the parameters related to the hosting machines and links of the same sub-cluster by interrogating the other modules running on the same physical machine:
It will be appreciated that each of the modules above retrieves the information from the surrounding hosting machines through the DASS instance running on each hosting machine.
The telecommunication application and the virtualization engine receive all the resource and bandwidth reservation instructions related to the implementation of a new multi-period workload placement configuration. Finally, the distributed multi-period orchestrator instance transmits to the geo-location module all the FOA information of the virtualized elements (application nodes) demanding placement; in this way the geo-location module will be able to return the list of hosting machines of the sub-cluster of interest that are compatible with the FOA.
A special virtual component is represented by a Distributed Database (DD) middleware specifically tailored to run on top of Mobile Ad-Hoc Networks (MANETs) and Opportunistic Networks (ONs), and compatible with any kind of network. A DD middleware called Distributed Advanced Storage Service (DASS) was developed. It:
A DASS instance is run in a dedicated virtual container that is pre-deployed on each hosting machine aiming to participate in the virtualization ready physical infrastructure. The DASS instance is leveraged by the distributed multi-period orchestrator instance of each hosting machine to distribute all the information required by the distributed multi-period workload placement algorithms to build the local sub-clusters and compute the corresponding multi-period workload placement configurations for an application demanding resources. As already pointed out in the previous section, DASS is exploited by all the other modules (not only the orchestrator) to distribute information across the hosting machines of the virtualization ready physical infrastructure.
The energy manager has the main responsibility of triggering battery recharging procedures (not run by the distributed multi-period orchestration system) that temporarily exclude a hosting machine from the virtualization ready physical infrastructure (it is marked as busy through the corresponding γ parameter) to give it time to complete the recharging procedures. Note that the Θ variables modified by the distributed multi-period orchestrator to assign each moving node to a recharging station are simply used to guarantee that a close enough recharging station is always available; however, these variables have no impact on the energy management routines of the energy management layer.
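The guarantee encoded by the Θ assignment variables (that a close enough recharging station is always reachable) can be sketched as follows (simple 2-D distance model; names are illustrative):

```python
def reachable_station(host_pos, residual_range_m, stations):
    """Illustrative sketch: return the nearest recharging station the moving
    host can still reach with its residual battery range, or None when no
    station is close enough. Actual recharging procedures are handled by
    the energy management layer, not by this check."""
    best_name, best_dist = None, None
    hx, hy = host_pos
    for name, (sx, sy) in stations.items():
        d = ((hx - sx) ** 2 + (hy - sy) ** 2) ** 0.5
        if d <= residual_range_m and (best_dist is None or d < best_dist):
            best_name, best_dist = name, d
    return best_name
```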
This module is used to configure:
These parameters are transmitted to the orchestrators of the same hosting machine, and are also distributed to the surrounding hosting machines through DASS.
At run-time (at each optimization round) the energy management daemon communicates to the distributed multi-period orchestrator instance of its hosting machine all the real-time battery autonomy data σ.
The multi-period workload placement solution computed by the distributed multi-period orchestration system determines the final position assigned to a moving hosting machine to satisfy a virtualized element (application node). The solution guarantees that all network-related constraints are satisfied by considering both pre-optimization and post-optimization positions of the hosting machines.
The network aware path manager is an auxiliary module that has the responsibility of coordinating the movements of all the moving hosting machines. Its goal is to guarantee that the final network configuration computed by the distributed multi-period orchestration system by considering the hosting machines placed in their destination positions will remain valid along the whole traveling period. It will be appreciated that this process can be decomposed into multiple independent sub-instances (one per application affected by moving tasks) thanks to the problem constraints (111)-(113), which prevent the distributed multi-period orchestrator from co-placing a moving virtualized element with another virtualized element of a different application.
The path planning algorithm can be implemented in many different ways. It can be a centralized path planning algorithm running on each sub-cluster supervisor hosting machine, as well as a distributed network maintenance system based on proper node attraction parameters aiming to keep close the physical edges of the relevant links (see the potential-based method used in [2]).
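A minimal sketch of a potential-based maintenance rule of the kind cited as [2] is the following (the gain and target distance are illustrative parameters, not taken from the source):

```python
def attraction_step(pos, neighbor_positions, target_dist_m, gain=0.1):
    """One step of a potential-based maintenance rule: a moving node is
    pulled toward each neighbor farther away than `target_dist_m`, which
    keeps the physical endpoints of the relevant links close together."""
    x, y = pos
    fx = fy = 0.0
    for nx, ny in neighbor_positions:
        dx, dy = nx - x, ny - y
        d = (dx * dx + dy * dy) ** 0.5
        if d > target_dist_m:            # only stretched links attract
            fx += gain * (d - target_dist_m) * dx / d
            fy += gain * (d - target_dist_m) * dy / d
    return (x + fx, y + fy)
```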
It will be appreciated that the path-planner is also responsible for physically moving the underlying hosting machine.
The geo-location module is a system based on a software module and a physical interface, or a combination of several of them, capable of estimating the current position of a host machine.
Examples of geo-location modules include:
This module also computes, following requests of the distributed multi-period orchestrator, the binary geo-localization parameters ĀzijFO that determine the hosting machines that, based on their location, are authorized to host a given application.
Each hosting machine that becomes a member of the virtualization ready physical infrastructure runs the so-called reputation estimator, a software module responsible for computing a reputation score κi of each hosting machine i∈N.
A reputation value is assigned to each hosting machine by all the other hosting machines available on the telecommunication network. The reputation value is then continuously updated as operations keep running and hosting machines show their level of reliability and participation. Practically speaking, a hosting machine that appears for the first time should receive a basic reputation score from all the other hosting machines. This score can then be progressively improved as the new hosting machine keeps hosting new virtualized elements (application nodes) while guaranteeing the desired level of QoS. In terms of practical implementation, each hosting machine is constantly informed of the state of the other hosting machines lying within a certain hop distance (information is shared through DASS, see Section 5). Then, each hosting machine merges this real-time information with the historical data available on the surrounding hosting machines to determine metrics such as:
These metrics are then elaborated by an algorithm to extract the instantaneous reputation score assigned to a surrounding hosting machine. The reputation values are constantly distributed across the hosting machines of the virtualization ready physical infrastructure, so that the final reputation value assigned to a hosting machine and used by the distributed multi-period orchestrator is the result of a collaborative estimation effort. In fact, due to the opportunistic nature of virtualization ready physical infrastructure management process, a hosting machine considered unreliable by a certain neighbor may be estimated as very efficient by another (due to past collaborations in a common virtualization ready physical infrastructure).
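The collaborative reputation estimation can be sketched as follows (the metric weights and the averaging rule are illustrative assumptions, not taken from the source):

```python
def instantaneous_score(uptime_ratio, qos_violation_ratio, w_up=0.6, w_qos=0.4):
    """Combine per-host metrics of the kind listed above into one score in
    [0, 1]; the weights are purely illustrative."""
    return w_up * uptime_ratio + w_qos * (1.0 - qos_violation_ratio)

def collaborative_reputation(neighbor_scores, base_score=0.5):
    """Merge the scores reported by several neighbors into the final value
    κi used by the orchestrator: a plain average here, so a host judged
    unreliable by one neighbor can be redeemed by the positive history kept
    by others; a newly appeared host with no reports gets the basic score."""
    if not neighbor_scores:
        return base_score
    return sum(neighbor_scores) / len(neighbor_scores)
```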
This module has the responsibility of managing the first interactions with a new hosting machine that appears as a direct neighbor on the underlying telecommunication network. In particular, it will take care of:
Each hosting machine participating in a virtualization ready physical infrastructure runs the so-called virtualization engine, i.e., a software module whose main responsibilities include:
Note that the OS and the physical hardware of a physical server running a virtualization engine must be configured to allow resource virtualization. For instance, with Intel machines, the Intel Virtualization Technology option must be enabled into the BIOS menu. Examples of popular virtualization engines include:
The virtualization engine keeps informing the distributed multi-period orchestration instance of the same hosting machine about:
The whole virtualization ready physical infrastructure relies on a telecommunication network interconnecting all the hosting machines. In this implementation, the ad-hoc communication network built by the HEAVEN communication middleware is considered. HEAVEN is a middleware running in the user space, and thus potentially compatible with any kind of device without the need of modifying the underlying Operating System (OS).
HEAVEN builds a virtual network layer able to seamlessly interact (through dedicated virtual link layers) with different types of network transmission technologies. For instance, HEAVEN can manage Wi-Fi interfaces running in ad-hoc (or IBSS) mode [3], as well as Wi-Fi interfaces acting as base station or client in a traditional infrastructure mode.
HEAVEN offers both unicast and broadcast communication services by relying on three types of routing protocols:
HEAVEN is responsible for discovering new available network nodes and authorizing them to participate to the network. HEAVEN provides all the APIs required by the architecture orchestrator to collect the network information related to the network parameters of the multi-period workload placement problem:
The telecommunication network is also meant to receive the bandwidth allocation instructions directly from the distributed multi-period orchestration instance running above.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2020/052835 | 3/25/2020 | WO |
Number | Date | Country | |
---|---|---|---|
62824047 | Mar 2019 | US |