The present invention belongs to the technical field of Internet of Things, and particularly relates to a method for supporting adaptive unloading of multi-Internet of Things (IoT) applications in an edge environment.
With the rapid development of the Internet of Things (IoT) and IoT equipment, a variety of resource-intensive IoT applications, for example, human face recognition, speech semantic analysis, interactive games, and augmented reality, have emerged. Limited by processing capacity, memory capacity, and battery capacity, most resource-intensive IoT applications cannot operate directly on the IoT equipment.
Computing unloading is an effective method to solve the resource constraints of the IoT equipment. Particularly, with the proposal of mobile cloud computing (MCC), all or part of the computing tasks are unloaded to a cloud server which provides huge storage and computing resources. More particularly, an application is divided at a certain granularity, and some of the computing intensive tasks are unloaded to the cloud for execution, while other simple tasks are processed locally, thereby shortening the response time of the application.
However, the distance between the IoT equipment and the cloud server is relatively far, which may cause a significant execution delay. In addition, massive data transmission may result in traffic congestion in the core network. To solve the above problems, mobile edge computing (MEC) is introduced. The mobile edge provides computing capacity and storage resources superior to those of the IoT equipment and is closer to the IoT equipment, which may significantly shorten the delay.
Unloading the intensive tasks to the cloud server or an edge server is indeed capable of alleviating the computing and storage pressures of the IoT equipment. However, the variety of application types and the complicated and variable operating environments of the MEC make computing unloading in the MEC difficult. Moreover, overcoming the expensive overhead brought by computing unloading is also a nonnegligible problem. Specifically speaking, there are the two challenges below:
Challenge 1: an enabling mechanism. For various types of IoT applications, designing a universal method which is capable of decoupling different types of strongly coupled single applications into different functional modules and supporting on-demand computing unloading becomes a challenge.
Challenge 2: unloading decision making. In the MEC, the computing resources are scattered over the IoT equipment, the edge servers, and the cloud server, and the resources of different computing platforms have significant differences in performance. Different applications may scramble for resources in their executive processes. In such a complicated and varied network environment, obtaining an appropriate unloading strategy that minimizes the money overhead while satisfying the deadline of each application is another challenge.
Most previous studies focus on solving the computing unloading of Android applications. With the development of deep neural networks (DNN), more studies have shifted their target to DNN unloading. However, these studies are only targeted at specific types of applications and lack universality. Moreover, these studies target the computing unloading problem of a single application and cannot make full use of the scattered and heterogeneous computing resources in the MEC environment.
An object of the present invention is to provide a method for supporting adaptive unloading of multi-Internet of Things (IoT) applications in an edge environment. The method supports the computing unloading of different types of applications in the MEC environment, and minimizes the system overhead generated by the computing unloading.
To achieve the above object, the present invention adopts the following technical solution: a method for supporting adaptive unloading of multi-Internet of Things (IoT) applications in an edge environment, including:
Further, the universal program structure supporting on-demand computing unloading is designed through a design mode supporting independent deployment and execution of modules of the different types of applications; the design mode satisfies the following conditions: each module of the application is capable of being independently deployed and executed at different computing nodes; a module is capable of determining the computing node where a module or a database to be invoked is located, so as to guarantee invoking validity; and the invoked module is capable of receiving the input of the invoking module and returning a correct executive result;
Further, the unloading solution generating algorithm describes the application unloading problem in the MEC environment from the aspects of a network model, an application model, and a performance model, and solves the application unloading problem through the multi-task particle swarm optimization-genetic algorithm (MPSO-GA);
to compute a response time and a money overhead of the application, an unloading solution corresponding to the services and databases is constructed, where SPi={sp(si1), sp(si2), . . . , sp(sio)} represents the deployment addresses of the services in the application ai; for each program fragment sij,k, the executive address thereof is dependent on the deployment address of the service sij, i.e., sp(sij,k)=sp(sij); and DPi={dp(bi1), dp(bi2), . . . , dp(bil)} represents the deployment addresses of the databases in the application ai;
an arrival time AT(sij,k) is used to represent a time when a task executing the program fragment sij,k arrives at the computing node; a time when the fragment is actually executed is represented as ST(sij,k)=AT(sij,k)+WT(sij,k), where WT(sij,k) represents a queuing time of the fragment on the node; a finish time FT(sij,k) of sij,k is the sum of the start time and the occupied time on the node sp(sij,k), i.e., FT(sij,k)=ST(sij,k)+OT(sij,k); a start time when the application ai is executed for the λi-th time is the arrival time AT(si1,1) of the first program fragment si1,1 executing the primary service, and the finish time is the time FT(si1,f) when the last fragment si1,f of the primary service is executed, wherein the response time when the application ai is executed for the λi-th time is RT_i = FT(si1,f) − AT(si1,1);
NC(s_{ij,k}) = OT(s_{ij,k}) · c_{sp(s_{ij,k})}^{com}

TC(s_{ij,k}) = IT(s_{ij,k}) · c_{sp(s_{ix,y}),sp(s_{ij,k})}^{tran}

The money overhead MC(sij) generated by invoking the service sij once is the sum of the using overheads and the transmission overheads of all the program fragments included by the service, computed as follows:

MC(s_{ij}) = Σ_{k=1}^{f} (NC(s_{ij,k}) + TC(s_{ij,k}))   (9)

The money overhead MC(ai) generated by executing the application ai once is the sum of the products between the money overheads of all services and the numbers of times the services are invoked, computed as follows:

MC(a_i) = Σ_{j=1}^{o} MC(s_{ij}) · s_{ij}.Time   (10)
An optimization object of the unloading solution generating algorithm is to obtain an unloading solution for the services and databases to minimize the total money overhead under the condition that the response time of each application satisfies the time constraint within the time period θ; and finally, a formalized definition of the application unloading problem is constructed;
Further, a specific implementation method for the unloading solution generating algorithm based on the multi-task particle swarm optimization-genetic algorithm is as follows:
Z_i^t = (z_{i,1}^t, z_{i,2}^t, . . . , z_{i,p}^t, z_{i,p+1}^t, . . . , z_{i,p+q}^t)   (12)

z_{i,k}^t = (n_j)_{i,k}^t   (13)

where p and q respectively represent the total quantities of the services and databases in the whole system; in the equation (13), z_{i,k}^t (k=1, 2, . . . , p) represents that, in the ith particle of the tth iteration, the kth service is deployed on the computing node nj, and z_{i,k}^t (k=p+1, p+2, . . . , p+q) represents that the (k−p)th database is deployed on the computing node nj;

the fitness of a feasible particle is defined as its total money overhead, F(Z_i^t) = MC_total(Z_i^t);
Further, an implementation flow of the MPSO-GA is as follows:
S4: updating the individually optimal particle of each particle; and if there is a solution better than the current globally optimal particle, updating the globally optimal particle; and
S5: if an iteration stopping condition is satisfied, finishing the algorithm; and otherwise, returning to S3 for continuous iteration.
Compared with the prior art, the present invention has the following beneficial effects: provided is a method for supporting adaptive unloading of multi-Internet of Things (IoT) applications in an edge environment, including the following steps: constructing an application compliant with a universal program structure supporting on-demand computing unloading; for the application compliant with the structure, extracting a program fragment flowchart through static code analysis; analyzing the program fragment flowchart and the context through an unloading solution generating algorithm based on a multi-task particle swarm optimization-genetic algorithm; and finally, performing, by each application, computing unloading at the granularity of a service according to the unloading solution, so as to minimize the overhead under the circumstance that the deadline constraint of each application is satisfied. The method supports computing unloading by different types of applications at the granularity of a service. Compared with other classical methods, 2.11-17.51% of the system cost can be saved under the circumstance that the deadline constraint is satisfied.
Further description of the present invention will be made below in combination with drawings and embodiments.
It is to be noted that the detailed description below is exemplary and is intended to further describe the application. Unless specified otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the application belongs.
It should be noted that the terms used herein are merely intended to describe specific implementation modes rather than to limit the exemplary implementation modes according to the present application. As used herein, unless otherwise specified in the context, the singular form is further intended to include the plural form. In addition, it is to be further understood that the terms “comprise” and/or “include” used in the description indicate the presence of the stated features, steps, operations, apparatuses, assemblies, and/or combinations thereof.
As shown in the drawings, the method includes two key components: an unloading enabling mechanism and an unloading solution generating algorithm.
The unloading enabling mechanism is the key component through which the application can perform computing unloading. The method designs a universal design mode supporting on-demand computing unloading and proposes a corresponding program structure. An application compliant with the program structure can be decoupled into a plurality of services, and different services can be executed in a distributed manner at different computing nodes (including the IoT equipment, the cloud, and the edge). The application determines the specific computing nodes according to configuration files during the executive process, so as to achieve on-demand unloading of the program. Moreover, the program fragment flowchart of the application can be obtained through static code analysis to support generation of the subsequent unloading strategy.
The unloading solution generating algorithm analyzes the peripheral MEC environment and the applications to obtain the optimal unloading solution. The method gives a formalized definition of the application unloading problem to be optimized by establishing a system model, an application model, and a performance model. An unloading solution for the services and databases is generated through the MPSO-GA, so that each application minimizes the overall money overhead under the circumstance that the deadline constraint is satisfied.
As a conventional single application cannot be split, such an application can only be executed on the IoT equipment or a remote server as a whole. To achieve on-demand computing unloading of the application, the method provides a design mode where the modules of different types of applications are independently deployed and executed, so as to design a universal program structure supporting on-demand computing unloading. The design mode satisfies the following properties:
The present invention provides an application design mode to satisfy the above attributes, as shown in the drawings.
In the method, provided is a program structure of an application to satisfy the design mode shown in the drawings.
First, the invoking path from the service to be invoked to the primary service is determined through seq and serviceName, and the address where the service is executed is determined from the configuration file SP according to the invoking path; SP stores the deployment addresses of the different services (the 2nd line). If the address is equal to Local, a local function directly executes the service locally and returns a result; it is to be noted that Local here represents the computing node where the service is invoked (the 3rd-6th lines). If the address is not equal to Local, the servicei, Paramsi, and seq are sent to an agent program at a remote node through a remote function, and the remote agent invokes the servicei through a reflection mechanism and finally returns a result (the 7th-9th lines).
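As an illustration, a minimal Python sketch of such an Invoke controller is given below; the configuration dictionary SP, the invoking-path format, and the JSON-over-socket agent protocol are assumptions made for the sketch, not the original implementation:

    import json
    import socket

    # Hypothetical configuration file SP: invoking path -> deployment address.
    SP = {"main/recognize": "Local", "main/detect": "192.168.1.20:9000"}

    def invoke(seq, service_name, params, local_services):
        """Dispatch a service call locally or to a remote agent (sketch)."""
        call_seq = seq + "/" + service_name          # invoking path back to the primary service
        address = SP.get(call_seq, "Local")          # deployment address from SP
        if address == "Local":
            # Execute the service directly on the current computing node.
            return local_services[service_name](**params)
        # Otherwise ship the service name, parameters, and path to the remote
        # agent, which invokes the service by reflection and returns the result.
        host, port = address.split(":")
        with socket.create_connection((host, int(port))) as conn:
            conn.sendall(json.dumps({"seq": call_seq, "service": service_name,
                                     "params": params}).encode())
            return json.loads(conn.recv(65536).decode())

    # Usage: invoke("main", "recognize", {"img": "plate.png"},
    #               local_services={"recognize": lambda img: img.upper()})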
First, the deployment address host of the database is determined from a configuration file DP according to DatabaseNamei, where DP stores the deployment addresses of the different databases (the 2nd line); then the corresponding user name and password are acquired, through host and databasei, from Accounts, which records the accounts and passwords of the databases (the 3rd-4th lines); and finally, a data transmission channel is established and returned (the 5th-6th lines).
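A matching Python sketch of the Call function follows; DP, ACCOUNTS, and the connect() stand-in for a real database driver are likewise illustrative assumptions:

    # Hypothetical configuration file DP and account table (illustrative values).
    DP = {"plates_db": "10.0.0.5"}
    ACCOUNTS = {("10.0.0.5", "plates_db"): ("user", "password")}

    def connect(host, database, user, password):
        """Stand-in for a real database driver's connect(); returns a channel object."""
        return {"host": host, "database": database, "user": user}

    def call(database_name):
        """Locate the database and return an open data transmission channel (sketch)."""
        host = DP[database_name]                          # deployment address from DP
        user, password = ACCOUNTS[(host, database_name)]  # account recorded for this database
        return connect(host, database_name, user, password)

    print(call("plates_db")["host"])  # 10.0.0.5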
In conclusion, “Paramsi” and “result” ensure the independence of the service. The controller formed by “seq”, “Invoke”, and “call” ensures the validity and correctness of the service invoking other services and databases in the executive process.
To provide the internal flow information of the program for generating the subsequent unloading solution, the program needs to be analyzed. In the present invention, the program fragment flowchart is extracted through static code analysis, with an implementation method as follows:
The program fragment flowchart of the ith application is represented by a symbol Flowi=(Di, Si, Pi, Qi), where Di={di1, di2, . . . , dil} represents the set of databases in the application, and dij.Name is used to represent the name of the database dij. Si={si1, si2, . . . , sio} represents the set of services in the application, and sij.Name is used to represent the name of the service sij; sij.callSeq represents the invoking path from the service sij to the primary service si1; and sij.Time represents the total number of times that the service sij is invoked. In the presence of service invoking, each service is segmented into a plurality of serial code fragments in the actual executive process, i.e., sij={sij,1, sij,2, . . . , sij,f}. Pi is a set which records the actual program fragment executive sequence, and the program fragments in the set can be repeated; the sequence in which the program fragments are added into the set is the actual executive sequence in the application. For two adjacent fragments, in the method, the previous fragment is called a precursor fragment of the next fragment. Qi represents a set which records the correlations between the program fragments and the databases, where (sij,k, dih)∈Qi represents that there is an interaction between sij,k and the database dih in the executive process.
Each of the code fragments is defined as a triad sij,k=⟨αij,k, βij,k, γ(sij,k, dih)⟩, where αij,k represents the size of the data imported into the fragment sij,k; βij,k represents the task load of the fragment sij,k; and γ(sij,k, dih) represents the total size of the data exchanged between the fragment sij,k and the database dih in the executive process, which is set to 0 if there is no interaction (as prediction of the specific information of the program fragments is not studied in the method, development personnel need to provide it in the JSON format).
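For concreteness, the flowchart and the fragment triad might be represented with data structures like the following Python sketch; the field names mirror the notation above and are not from the original source:

    from dataclasses import dataclass, field
    from typing import Dict, List, Set, Tuple

    @dataclass
    class Fragment:                    # s_ij,k = <alpha, beta, gamma>
        name: str
        alpha: float = 0.0             # size of the data imported into the fragment
        beta: float = 0.0              # task load of the fragment
        gamma: Dict[str, float] = field(default_factory=dict)  # database -> interaction size (0 if absent)

    @dataclass
    class Service:                     # s_ij
        name: str
        call_seq: str                  # invoking path back to the primary service s_i1
        times: int = 1                 # s_ij.Time: total number of invocations
        fragments: List[Fragment] = field(default_factory=list)

    @dataclass
    class Flow:                        # Flow_i = (D_i, S_i, P_i, Q_i)
        D: Set[str] = field(default_factory=set)                # databases
        S: Dict[str, Service] = field(default_factory=dict)     # services keyed by call_seq
        P: List[str] = field(default_factory=list)              # actual fragment executive sequence
        Q: Set[Tuple[str, str]] = field(default_factory=set)    # (fragment, database) interactions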
The addresses of the Invoke function and the Call function in the source codes are analyzed by a static code analysis technology to obtain the program fragment flowchart Flowi=(Di, Si, Pi, Qi) of the application i, with the pseudocode shown in algorithm 3.
Algorithm 3 (getFlow): inputs, the source codes of the ith application and the service name of the primary service; output, the program fragment flowchart Flowi=(Di, Si, Pi, Qi). (The pseudocode listing is indicated as missing or illegible in the original filing; its steps are described below.)
By taking the source codes of the ith application and the service name of the primary service as inputs, the algorithm 3 constructs the corresponding program fragment flowchart, including the following steps: first, assigning the information of the primary service to si1, and extracting the corresponding sentence set Ui1 through si1.Name, corresponding to the 31st-32nd lines; then adding the primary service si1 into a null set Si, and setting Di, Pi, and Qi to be null, corresponding to the 33rd line; and finally, invoking the getFlow() function to construct the program fragment flowchart in a recursive fashion, corresponding to the 34th line. The getFlow() function corresponds to the 1st-30th lines; sia and Uia are taken as the inputs of the function to respectively represent the currently analyzed service and the sentence set corresponding to the service. In the 2nd line, pId is used to record the number of the current program fragment and is initialized as 1, and the current program fragment sia,pId is added into sia and Pi. Each sentence in the sentence set is then processed as follows: the keywords in the sentence uia,j are acquired, corresponding to the 4th line. If the keywords include the keyword 'invoke', whether the invoked service has appeared is judged according to the invoking path callSeq: if the invoked service has appeared, its invoking count is increased by 1; if it has not appeared, it is added into Si, and the corresponding sentence set is obtained through the service name, corresponding to the 5th-14th lines. In the 15th line, pId is updated, and the new current program fragment sia,pId is added into sia and Pi. In the 16th line, getFlow() recursion is performed on the service sib. If the keywords in the sentence uia,j include the keyword 'call', in a similar way, whether the database dih is already in the set Di is judged first, and if not, the database is added; then, whether (sia,pId, dih) is already in the set Qi is judged, and if not, it is added, corresponding to the 18th-29th lines.
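A simplified Python sketch of this recursive construction is given below; get_units() stands in for the static code analysis, statements are pre-tokenized ('invoke'/'call') pairs, and the fragment bookkeeping is reduced to its essentials, so this illustrates the recursion rather than reproducing the patented algorithm 3 itself:

    def get_units(source, service_name):
        """Stand-in for static analysis: return the statement list of a service."""
        return source.get(service_name, [])

    def get_flow(source, service, flow):
        """Recursively build Flow_i = (D, S, P, Q) starting from `service`."""
        frag_id = 1
        flow["P"].append((service["name"], frag_id))      # current fragment s_ia,pId
        for kind, target in get_units(source, service["name"]):
            if kind == "invoke":                          # a service invocation segments the caller
                call_seq = service["call_seq"] + "/" + target
                if call_seq in flow["S"]:
                    flow["S"][call_seq]["times"] += 1     # seen before: bump the invoking count
                else:
                    child = {"name": target, "call_seq": call_seq, "times": 1}
                    flow["S"][call_seq] = child
                    get_flow(source, child, flow)         # callee fragments execute here
                frag_id += 1                              # execution resumes in a new fragment
                flow["P"].append((service["name"], frag_id))
            elif kind == "call":                          # a database access
                flow["D"].add(target)
                flow["Q"].add(((service["name"], frag_id), target))
        return flow

    source = {"main": [("invoke", "detect"), ("call", "plates_db")], "detect": []}
    primary = {"name": "main", "call_seq": "main", "times": 1}
    flow = get_flow(source, primary,
                    {"D": set(), "S": {"main": primary}, "P": [], "Q": set()})
    print(flow["P"])  # [('main', 1), ('detect', 1), ('main', 2)]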
The unloading solution generating algorithm describes the application unloading problem in the MEC environment from the network model, the application model, and the performance model. The network model describes the network environment of the applications; the application model defines the plurality of applications, compliant with the program structure proposed in the method, to be unloaded; and the performance model corresponds to the content that the method attempts to optimize (introduced respectively in 3.1, 3.2, and 3.3); then, the problem to be solved by the method is described according to the above three models (section 3.4); and finally, the application unloading problem is solved through the MPSO-GA (section 3.5).
The network environment includes a cloud server (CS), a plurality of edge servers (ES), and a plurality of IoT equipment (DS). The MEC environment is modeled as a graph G=(N,E), including a set N={n1, n2, . . . , nw} of computing nodes and a set E used to represent the connections among them. A computing node is represented by ni=<Ψi, ωi, cicom>, where Ψi represents the computing capacity of the computing node ni, which is usually measured by the CPU; as the method mainly focuses on seeking the unloading solution, we assume that the storage capacity of each node is capable of satisfying the corresponding demand; ωi represents the maximum number of tasks capable of being executed by the node ni, and its value is equal to the number of CPU cores of the node; and cicom represents the computing cost of the computing node ni per second. ei,j represents the connection between ni and nj, which is characterized by the bandwidth vi,j and the minimum link delay rtti,j; ci,jtran is used to represent the transmission overhead between the nodes ni and nj per second.
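As an illustration, the network model might be captured by structures like the following Python sketch; all concrete numbers are placeholders, not values from the original source:

    from dataclasses import dataclass

    @dataclass
    class Node:            # n_i = <Psi_i, omega_i, c_i^com>
        psi: float         # computing capacity (e.g. CPU frequency)
        omega: int         # maximum number of concurrent tasks (number of cores)
        c_com: float       # computing cost per second

    @dataclass
    class Link:            # e_ij between n_i and n_j
        v: float           # bandwidth v_ij
        rtt: float         # minimum link delay rtt_ij
        c_tran: float      # transmission overhead per second c_ij^tran

    N = {"d1": Node(psi=1.4e9, omega=4, c_com=0.0),     # IoT equipment
         "e1": Node(psi=2.5e9, omega=2, c_com=0.02),    # edge server
         "c1": Node(psi=3.9e9, omega=8, c_com=0.05)}    # cloud server
    E = {("d1", "e1"): Link(v=10e6, rtt=0.01, c_tran=0.005),
         ("d1", "c1"): Link(v=5e6, rtt=0.05, c_tran=0.01)}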
It is assumed that the system executes for a time period θ during the unloading process. There are a plurality of applications A={a1, a2, . . . , ag} from the IoT equipment in the system, and each application ai is executed at a time interval of μi, so each application is executed ξi=θ/μi times within the time period θ; dli is used to represent the deadline of the application ai. Each of the applications is compliant with the universal program structure and is modeled as the program fragment flowchart Flowi=(Di, Si, Pi, Qi) according to 2.3.
To compute the response time and the money overhead of the application, an unloading solution corresponding to the services and databases has to be appointed. SPi={sp(si1), sp(si2), . . . , sp(sio)} represents the deployment addresses of the services in the application ai. As far as each program fragment sij,k is concerned, the executive address is dependent on the deployment address of the service sij, i.e., sp(sij,k)=sp(sij). DPi={dp(bi1), dp(bi2), . . . , dp(bil)} represents the deployment addresses of the databases in the application ai.
In the λi-th execution of the application ai, an arrival time AT(sij,k) is used to represent the time when the task executing the program fragment sij,k arrives at the computing node. The time when the fragment is actually executed is represented as ST(sij,k)=AT(sij,k)+WT(sij,k), where WT(sij,k) represents the queuing time of the fragment on the node.

The data importing time IT(sij,k), i.e., the time when the precursor fragment six,y transfers the data to the program fragment sij,k, is as follows:

IT(s_{ij,k}) = rtt_{sp(s_{ix,y}),sp(s_{ij,k})} + α_{ij,k} / v_{sp(s_{ix,y}),sp(s_{ij,k})}

The time OT(sij,k) when the program fragment sij,k occupies the node sp(sij,k) comprises the self-task execution time ET(sij,k) and the data transmission times DT(sij,k, dih) with the databases during execution, where OT(sij,k) is computed as follows:

OT(s_{ij,k}) = ET(s_{ij,k}) + Σ_{(s_{ij,k}, d_{ih})∈Q_i} DT(s_{ij,k}, d_{ih})

The finish time FT(sij,k) of sij,k is the sum of the start time and the occupied time on the node sp(sij,k), computed as follows:

FT(s_{ij,k}) = ST(s_{ij,k}) + OT(s_{ij,k})

The start time when the application ai is executed for the λi-th time is the arrival time AT(si1,1) of the first program fragment si1,1 executing the primary service, and the finish time is the time FT(si1,f) when the last fragment si1,f of the primary service is executed. The response time RT_i^{λi} when the application ai is executed for the λi-th time is computed as follows:

RT_i^{λ_i} = FT(s_{i1,f}) − AT(s_{i1,1})   (6)
When the computing node sp(sij,k) executes the program fragment sij,k, the using overhead of the fragment occupying the node is computed as follows:

NC(s_{ij,k}) = OT(s_{ij,k}) · c_{sp(s_{ij,k})}^{com}

During the execution of the program fragment sij,k, the transmission overhead generated by importing the data and interacting with the databases is computed as follows:

TC(s_{ij,k}) = IT(s_{ij,k}) · c_{sp(s_{ix,y}),sp(s_{ij,k})}^{tran}

The money overhead MC(sij) generated by invoking the service sij once is the sum of the using overheads and the transmission overheads of all the program fragments included by the service, computed as follows:

MC(s_{ij}) = Σ_{k=1}^{f} (NC(s_{ij,k}) + TC(s_{ij,k}))   (9)

The money overhead MC(ai) generated by executing the application ai once is the sum of the products between the money overheads of all services and the numbers of times the services are invoked, computed as follows:

MC(a_i) = Σ_{j=1}^{o} MC(s_{ij}) · s_{ij}.Time   (10)
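To make the cost model concrete, the following Python sketch evaluates equations (9) and (10) for one application under assumed prices and times; all numbers are illustrative:

    def service_cost(fragments, c_com, c_tran):
        """MC(s_ij): sum of NC and TC over the fragments of one service."""
        total = 0.0
        for ot, it in fragments:        # (occupied time OT, data importing time IT)
            nc = ot * c_com             # using overhead of occupying the node
            tc = it * c_tran            # transmission overhead of importing the data
            total += nc + tc
        return total

    def application_cost(services):
        """MC(a_i): sum of MC(s_ij) * s_ij.Time over all services."""
        return sum(service_cost(frags, c_com, c_tran) * times
                   for frags, c_com, c_tran, times in services)

    # One service with two fragments, invoked 3 times, on a node charging
    # 0.05/s for computing and 0.01/s for transmission:
    print(application_cost([([(1.2, 0.3), (0.8, 0.1)], 0.05, 0.01, 3)]))  # ~0.312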
The optimization object of the method is to obtain an unloading solution for the services and databases to minimize the total money overhead under the condition that the response time of each application satisfies the time constraint within the time period θ. Finally, a formalized definition of the application unloading problem is as follows:

min MC_total = Σ_{i=1}^{g} ξ_i · MC(a_i),  s.t. RT_i^{λ_i} ≤ dl_i, ∀ a_i ∈ A, ∀ λ_i ∈ {1, . . . , ξ_i}   (11)
For the unloading solutions SP and DP, it is aimed to explore the optimum mapping from the services and the databases to the different computing nodes, so as to minimize the total money overhead under the circumstance that the deadline constraint of each application is satisfied. Exploring the optimum mapping from the services and the databases to the computing nodes has been proved to be an NP-hard problem. The conventional PSO has been widely applied to solving continuous optimization problems, but the optimum mapping from the services and the databases to the computing nodes is a discrete problem, which needs a novel encoding method. In addition, to avoid premature convergence of the conventional particle swarm optimization, an appropriate particle updating strategy needs to be introduced. To overcome the defects of the conventional PSO algorithm, the method provides the MPSO-GA to explore the optimal unloading strategy of multiple tasks in the MEC environment. The application unloading strategy based on the MPSO-GA is described below.
A good coding strategy can improve the search efficiency and performance of the PSO algorithm, and the coding strategy of the algorithm satisfies the following three principles:
Definition 1 (integrity): each candidate solution in a problem space can be encoded into a corresponding coding particle.
Definition 2 (non-redundancy): each candidate solution in the problem space only corresponds to one coding particle.
Definition 3 (viability): each coding particle in a code space can correspond to the candidate solution in the problem space.
A usual coding strategy hardly satisfies the three principles at the same time. The method uses the particle Z to represent a candidate solution for cost driven unloading of all services and databases in the MEC environment, where the ith particle in the tth iteration is described as (12):
Z_i^t = (z_{i,1}^t, z_{i,2}^t, . . . , z_{i,p}^t, z_{i,p+1}^t, . . . , z_{i,p+q}^t)   (12)

z_{i,k}^t = (n_j)_{i,k}^t   (13)

where p and q respectively represent the total quantities of the services and databases in the whole system. In the equation (13), z_{i,k}^t (k=1, 2, . . . , p) represents that, in the ith particle of the tth iteration, the kth service is deployed on the computing node nj, and z_{i,k}^t (k=p+1, p+2, . . . , p+q) represents that the (k−p)th database is deployed on the computing node nj.
Property 1: the coding strategy is capable of satisfying the integrity and the non-redundancy at the same time, but is not necessarily capable of satisfying the viability.

Each code bit of the particle represents the deployment address of the corresponding service or database, thereby satisfying the principle of integrity. Different encoding particles respectively represent different unloading solutions, and a certain feasible solution in the problem space only corresponds to one encoding particle in the code space, thereby satisfying the principle of non-redundancy. However, some candidate solutions corresponding to the particles may not satisfy the deadline constraint, i.e., the response times of some applications exceed the time limit, so that the principle of viability cannot be satisfied. In the method, all particles can be divided into two categories, feasible particles and infeasible particles, defined below.

Definition 4 (feasible particles): in the unloading strategy corresponding to the particle, each application satisfies the deadline constraint.

Definition 5 (infeasible particles): in the unloading strategy corresponding to the particle, the execution of at least one application cannot satisfy the deadline constraint.
To compare different particles, the particles need to be measured with a fitness function. Usually, particles with small fitness represent better candidate solutions. This work pursues an unloading solution which satisfies the deadline constraint of each application while minimizing the total money overhead; therefore, particles with a lower money overhead can be regarded as better solutions. Because some solutions may make the response times of the applications exceed the stipulated deadline, the particles are compared in the following three circumstances: (i) if both particles are feasible, the particle with the smaller fitness, defined as the total money overhead F(Z_i^t) = MC_total(Z_i^t) (14), is better; (ii) if one particle is feasible and the other is infeasible, the feasible particle is better; and (iii) if both particles are infeasible, the particle whose applications exceed their deadlines by a smaller amount is regarded as better.
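The three-way comparison can be sketched in Python as follows, where a particle is summarized by its total money overhead and its total deadline excess; the excess-based tie-breaking for infeasible particles is an assumption consistent with the definitions above:

    def better(p, q):
        """Each of p, q is (total_money_overhead, total_deadline_excess)."""
        feasible_p, feasible_q = p[1] == 0, q[1] == 0
        if feasible_p and feasible_q:
            return p if p[0] <= q[0] else q   # both feasible: lower overhead wins
        if feasible_p != feasible_q:
            return p if feasible_p else q     # feasible beats infeasible
        return p if p[1] <= q[1] else q       # both infeasible: smaller excess wins

    print(better((10.0, 0.0), (8.0, 1.5)))    # (10.0, 0.0): feasibility dominates cost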
The conventional particle swarm optimization includes three major parts: inertia, individual cognition, and social cognition. The iterative update of each particle is affected by the individually optimal address and the globally optimal address of the current generation. Premature convergence to a local optimum is a major defect of the particle swarm optimization algorithm. To improve the search ability of the algorithm, the method introduces the crossover operator and the mutation operator of the genetic algorithm for the particle update. The iterative update of the ith particle in the (t+1)th iteration is shown in the equation (17):

Z_i^{t+1} = F_g(F_p(F_u(Z_i^t, ϵ, ϕ_u), pBest_i^t, σ_p, ϕ_p), gBest^t, σ_g, ϕ_g)   (17)

where Fu() represents the mutation operation, Fp() and Fg() represent the crossover operations, ϵ is the inertia weight, σp and σg are the acceleration coefficients, ϕu, ϕp, and ϕg are random numbers in the interval [0,1], and pBestit and gBestt respectively represent the individually optimal particle of the ith individual and the globally optimal particle in the tth iteration.
For the inertial portion, the velocity of the particle is defined as:

F_u(Z_i^t, ϵ, ϕ_u) = Mu(Z_i^t) if ϕ_u < ϵ; otherwise Z_i^t

where Mu() represents the mutation operator, and ϵ represents the threshold at which the mutation happens. When the random number ϕu is less than ϵ, the mutation operation is executed: first, an address ind1 in the particle is selected randomly, and then a computing node is selected randomly to replace the computing node mapped onto ind1.
Property 2: the mutation operator can change a particle from feasible to infeasible, and vice versa.
The individual cognition and social cognition equations in the particle update are as follows:

F_p(Z_i^t, pBest_i^t, σ_p, ϕ_p) = Cp(Z_i^t, pBest_i^t) if ϕ_p < σ_p; otherwise Z_i^t

F_g(Z_i^t, gBest^t, σ_g, ϕ_g) = Cp(Z_i^t, gBest^t) if ϕ_g < σ_g; otherwise Z_i^t

where Cp() represents the crossover operation, pBestit represents the optimal historical particle of the ith individual in the tth iteration, gBestt represents the optimal historical particle among all the particles in the tth iteration, and σp (σg) represents the threshold of the individual (social) crossover. When the random number ϕp (ϕg) is less than the threshold σp (σg), the crossover operation is executed.
Property 3: a particle subjected to the crossover operator can be changed from infeasible to feasible, and vice versa.
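The mutation and crossover operators can be sketched in Python as below; particles are lists of computing-node indices, and the segment-based crossover is one common choice, assumed here since the original only names Cp() as a crossover operation:

    import random

    def mutate(particle, nodes, eps, phi_u):
        """F_u: if phi_u < eps, remap one randomly chosen code bit (Mu)."""
        child = particle[:]
        if phi_u < eps:
            ind1 = random.randrange(len(child))   # random address in the particle
            child[ind1] = random.choice(nodes)    # random replacement computing node
        return child

    def crossover(particle, best, sigma, phi):
        """F_p / F_g: if phi < sigma, splice a segment of pBest/gBest into the particle (Cp)."""
        child = particle[:]
        if phi < sigma:
            i, j = sorted(random.sample(range(len(child) + 1), 2))
            child[i:j] = best[i:j]                # inherit a segment from the guide particle
        return child

    z = [0, 1, 2, 0]                              # deployment of p services and q databases
    z = mutate(z, nodes=[0, 1, 2, 3], eps=0.8, phi_u=random.random())
    z = crossover(z, best=[1, 1, 1, 1], sigma=0.9, phi=random.random())
    print(z)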
(4) Mapping from the Particle to the Unloading Result
For one particle, the execution time of each application and the total money overhead in the corresponding unloading solution need to be computed. Therefore, the method provides an evaluation algorithm to map the particle to the unloading result. The idea of the evaluation algorithm is to compute the execution start time and the execution finish time of each program fragment by simulating the actual executive process of the program fragments, so as to compute the response time and the total money overhead of each application in the scheduling solution.
First, a variable curTime is set to represent the current time, initialized as 0. The input service and database unloading solutions are respectively SP and DP. A symbol λi is used to represent the number of times the application has currently been executed, with an initial value of 1. Three attributes, AT(sij,k), FT(sij,k), and RST(sij,k), are defined for each program fragment sij,k, respectively representing the arrival time of the task executing the fragment, the time when the fragment finishes being executed, and the residual execution time of the fragment; they are initialized as AT(si1,1)←0, FT(sij,k)←0, and RST(sij,k)←OT(sij,k). Then the response time and the money overhead when each application is executed can be computed through the equations (6) and (10) in 3.2.3. An algorithm 4 introduces the major steps of the evaluation algorithm:
Algorithm 4 (the evaluation algorithm): inputs, the unloading solutions SP and DP and the program fragment flowcharts Flowi=(Di, Si, Pi, Qi); outputs, the response time RT(ai) and the money overhead MC(ai) of each application, obtained by simulating the execution of all program fragments in timeslices. (The pseudocode listing is indicated as missing or illegible in the original filing; its steps are described below.)
The algorithm 4, by taking the unloading solution and the application flowcharts as the inputs, finally obtains the execution time of each application and the total money overhead of the system, including the following specific steps:

S1 (the 2nd-10th lines), filling channels: adding each program fragment onto the computing node in sequence according to the unloading solution until there is no idle channel on the computing node (ni.empty is used to represent the number of residual channels of the node ni, with an initial value of ωi), where a program fragment put into a channel needs to satisfy the condition that the task to execute the fragment has arrived at the computing node at the current time, i.e., AT(sij,k) ≤ curTime;

S2 (the 12th-18th lines), searching for the minimum timeslice: first, traversing the program fragments being executed on each computing node to find the shortest residual execution time RST, and assigning it to a timeslice slice; and then adding slice to the current time curTime to represent that a period with a duration of slice has elapsed;

S3 (the 20th-34th lines), computing the residual time of the program fragments: subtracting the timeslice slice from the residual time RST of each program fragment being executed, to represent that the fragment has been executed for a duration of slice; when the residual time RST(sij,k) is 0, the program fragment has been executed completely, and its finish time FT(sij,k) is set to curTime; the fragment is then removed from the channel, and 1 is added to the idle channel number ni.empty of the node; if the fragment is not the last fragment of Pi, the next program fragment six,y to be executed is taken out of Pi, where its task arrival time AT(six,y) is the sum of curTime and the data importing time IT(six,y); if the fragment is the last fragment of Pi and the current execution count λi is smaller than the total execution count ξi, the application has been executed completely and enters the next execution, where the arrival time of the first program fragment executed next time is the maximum value of its ideal triggering time and the current time; and

finally, computing the response time RT(ai) and the money overhead MC(ai) of each application through the equations (6) and (10).
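A much-simplified Python sketch of this timeslice simulation is shown below; fragments carry only an arrival time, a residual time, and a target node, and the application-level bookkeeping (λi, data importing, periodic re-triggering) is omitted for brevity:

    def evaluate(pending, nodes):
        """Simulate fragment execution in timeslices; returns fragments with finish times."""
        cur_time, finished = 0.0, []
        running = {name: [] for name in nodes}
        while pending or any(running.values()):
            for frag in list(pending):                       # S1: fill idle channels
                node = nodes[frag["node"]]
                if node["empty"] > 0 and frag["at"] <= cur_time:
                    node["empty"] -= 1
                    running[frag["node"]].append(frag)
                    pending.remove(frag)
            busy = [f for fs in running.values() for f in fs]
            if not busy:                                     # nothing running: jump to the next arrival
                cur_time = min(f["at"] for f in pending)
                continue
            slice_ = min(f["rst"] for f in busy)             # S2: minimum timeslice
            cur_time += slice_
            for name, fs in running.items():                 # S3: advance and retire fragments
                for frag in list(fs):
                    frag["rst"] -= slice_
                    if frag["rst"] <= 1e-12:
                        frag["ft"] = cur_time                # finish time FT
                        fs.remove(frag)
                        nodes[name]["empty"] += 1
                        finished.append(frag)
        return finished

    nodes = {"e1": {"empty": 2}}
    frags = [{"node": "e1", "at": 0.0, "rst": 1.0},
             {"node": "e1", "at": 0.5, "rst": 0.4}]
    print([f["ft"] for f in evaluate(frags, nodes)])         # [1.0, 1.4]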
The inertia weight ϵ may greatly affect the search ability and the convergence of the PSO algorithm: the greater the value of ϵ is, the stronger the global search ability is; the less the value of ϵ is, the stronger the local search ability is. A typical inertia weight adjustment method is as follows:

ϵ = ϵ_max − (ϵ_max − ϵ_min) · iters_cur / iters_max

where ϵmax and ϵmin respectively represent the maximum value and the minimum value of the initialized ϵ, and iterscur and itersmax respectively represent the current number of iterations of the algorithm and the initialized maximum number of iterations.
The change of ϵ in the classical update strategy is only related to the number of iterations, which cannot fit the nonlinear characteristic of minimizing the money overhead under multiple tasks well. Therefore, the method designs a discrete adjustment method that adaptively adjusts the inertia weight ϵ based on the advantages and disadvantages of the current swarm particles, shown as follows:

ϵ = ϵ_min + (ϵ_max − ϵ_min) · d(Z_i^{t−1}) / (p+q),  where d(Z_i^{t−1}) = Σ_{j=1}^{p+q} (1 − τ_j)

where d(Zit−1) represents the difference between the ith particle Zit−1 of the (t−1)th iteration and the globally optimal solution gBestt−1 of the (t−1)th iteration; τj is a statistical factor: when the value of τj is 1, it represents that the computing nodes mapped by Zit−1 and gBestt−1 on the jth code bit are the same, and otherwise, the value is 0. Therefore, the search capability of the algorithm can be adaptively adjusted according to the difference between the current particle and the globally optimal particle.
In addition, the two cognition factors σp and σg of the algorithm are set [24] by adopting a linear increase and decrease strategy, similar to the equation (22). σpstar and σgstar respectively represent the initial values of the parameters σp and σg, and σpend and σgend respectively represent the final values of the parameters σp and σg.
S1: initializing the related parameters of the MPSO-GA, including an initial swarm size δ, a maximum number of iterations itersmax, a maximum inertia weight ϵmax, a minimum inertia weight ϵmin, and the initial values σpstar and σgstar and final values σpend and σgend of the acceleration coefficients; and then randomly generating an initial swarm;
S2: computing the fitness of each particle according to the equations (14)-(16), selecting the optimal solution of each particle itself, and selecting the particle with the optimal fitness as the globally optimal solution in the current generation;
S3: updating each particle according to the equation (17), and re-computing the fitness of each particle;
S4: updating the individually optimal particle of each particle; and if there is a solution better than the current globally optimal particle, updating the globally optimal particle; and
S5: if an iteration stopping condition is satisfied, finishing the algorithm; and otherwise, returning to S3 for continuous iteration.
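Combining the pieces, the S1-S5 flow can be sketched compactly in Python; eval_fitness() is assumed to implement the comparison of equations (14)-(16) as a single lower-is-better value, and mutate()/crossover() are the operator sketches given earlier:

    import random

    def mpso_ga(eval_fitness, nodes, dim, size=100, iters_max=1000,
                eps_max=0.8, eps_min=0.2, sigma_p=0.9, sigma_g=0.4):
        # S1: initialize parameters and randomly generate the initial swarm.
        swarm = [[random.choice(nodes) for _ in range(dim)] for _ in range(size)]
        p_best = [z[:] for z in swarm]                        # per-particle optimum
        g_best = min(swarm, key=eval_fitness)[:]              # S2: globally optimal particle
        for t in range(iters_max):
            eps = eps_max - (eps_max - eps_min) * t / iters_max   # inertia weight schedule
            for i, z in enumerate(swarm):
                # S3: update each particle per equation (17) and recompute fitness.
                z = mutate(z, nodes, eps, random.random())
                z = crossover(z, p_best[i], sigma_p, random.random())
                z = crossover(z, g_best, sigma_g, random.random())
                swarm[i] = z
                # S4: update the individual optimum and, if better, the global optimum.
                if eval_fitness(z) < eval_fitness(p_best[i]):
                    p_best[i] = z[:]
                if eval_fitness(z) < eval_fitness(g_best):
                    g_best = z[:]
        return g_best                                         # S5: stop after iters_max iterations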
4 Evaluation of the method
In this part, we evaluate the method from two aspects. On the one hand, the validity of the method is verified by experiments in an actual scenario (section 4.1). On the other hand, the superiority of the MPSO-GA in solving the multi-task and multi-server unloading problem is verified through extensive simulation experiments (section 4.2).
To verify the validity of the method, experiments are carried out in a real scenario according to the following research questions (RQ).
RQ1: How effectively does the method improve the performance of the application? (Section 4.1.2)

RQ2: Is the extra execution overhead generated by the method within an acceptable range? (Section 4.1.3)
(1) Application: we have implemented three real-world applications compliant with the program structure mentioned herein, including a license plate recognition application (hereinafter referred to as LPRA), a target detection application (hereinafter referred to as TDA), and a human face recognition application (hereinafter referred to as FRA). The LPRA is an Android application, while the TDA and FRA are DNN-based applications; the source codes can be found in our GitHub project.
(2) Network environment: the network environment includes four computing nodes: one IoT equipment (D1) and three remote servers (E1, E2, and C1). As shown in Table 1, we have designed three network scenarios, where the scenario 1 is only connected to C1, the scenario 2 is connected to E1 and C1, and the scenario 3 is connected to all the remote servers. Each element in Table 1 represents the data transmission rate v and the minimum link delay rtt, where a smaller rtt and a greater v represent a better communication capacity. We use the network simulation tool Dummynet to control the available bandwidth.
(3) Equipment: we use workstations to simulate the edge servers (E1 and E2) and the cloud server (C1), where E1 is equipped with a 2.5 GHz 2-core CPU and a 4 GB RAM, E2 is equipped with a 3.0 GHz 4-core CPU and an 8 GB RAM, and C1 is equipped with a 3.9 GHz 8-core CPU and a 16 GB RAM. In addition, we also use two low-performance devices as the IoT equipment (D1): a Huawei Honor MYA-AL10 (1.4 GHz 4-core CPU, 2 GB RAM) and a Raspberry Pi 3 B+ (1.4 GHz 4-core CPU, 1 GB RAM). The LPRA runs on the MYA-AL10; the TDA and FRA run on the Raspberry Pi 3 B+.
(4) Application states: to measure the validity of the method, we have discussed three different states of the application: (i) the original application (hereinafter referred to as OA), which cannot support computing unloading and is only executed on the IoT equipment; (ii) the reconstructed application locally executed on the IoT equipment (hereinafter referred to as RAL), which has been reconstructed from the OA to satisfy the program structure mentioned herein; and (iii) the reconfigured application (hereinafter referred to as RAO), which is unloaded according to the unloading solution generated by the MPSO-GA, where the unloading solution can use the peripheral servers. To be rigorous, we have repeated the process 20 times to avoid incidental errors.
To measure the performance improvements of the method, we have executed the reconstructed applications (including the LPRA, TDA, and FRA) in the three different scenarios according to the unloading solution obtained by the MPSO-GA, and the experimental results are shown in the drawings.

According to the results, compared with the OA, the response times of the applications after computing unloading are shortened effectively, where the LPRA shortens the time by 3.97-19.38%, the TDA by 29.3-46.27%, and the FRA by 23.48-44.64%. Executing a neural network model usually needs a huge computing capacity; therefore, unloading the computing intensive tasks to servers with high performance leads to excellent performance improvements. The LPRA is a k-means-based license plate recognition application; to balance the data transmission time and the task execution time in the LPRA, most tasks in the LPRA are more suitably executed locally, particularly in a poor network environment. Therefore, computing unloading brings non-significant performance improvements to the LPRA.
At the same time, the performance improvements of the application differ in different network environments. In the scenario 3, the peripheral network resources are more abundant and higher in communication capacity, so that the performance improvements are obviously superior to those in the other conditions. In the scenario 1, which is only connected to the cloud server, most tasks are still locally executed, so that the performance can only be improved by 3.97-29.3%.

Generally speaking, the method can provide different scenarios and applications with performance improvements to a certain extent.
Compared with the original application, the structure mentioned herein generates extra execution overhead, which includes the following three major parts: (1) the external variables or results need to be packaged before and after invoking the service; (2) the invoked service or database needs to be determined by the controller (the Invoke or Call function) before execution; and (3) the communication and response time generated by the interaction between the controller and the service or database. To measure the extra overhead, the OA of each type of application is used as a basis.
The difference between the RAL and the OA represents the extra overhead of the method, and the difference between the RAO and the OA represents the performance improvement effect, where the RAO takes the mean response time in the different scenarios of RQ1. The experimental results are shown in the drawings.
Viewed from the experimental results, the TDA and LPRA generate more extra overhead in the execution process; in comparison, the overhead of the FRA can be ignored. An important reason for this result is that, compared with the TDA and LPRA, the FRA is of a simpler structure and includes fewer services and databases. The extra overhead for unloading is mainly generated when the services or databases are invoked: the more services or databases an application includes, the more extra overhead is needed.
In addition, the extra overhead can be compared intuitively with the performance improvement through the two broken lines in the drawings.
Generally speaking, the extra execution overhead generated by the method is within the acceptable range.
To verify the superiority of the MPSO-GA, extensive simulation experiments are carried out to answer the following research questions (RQ):

RQ3: Can the MPSO-GA obtain the optimal unloading solution in a simple network environment where different applications are executed independently?

RQ4: Can the MPSO-GA obtain the optimal unloading solution in a more complicated network environment where different applications scramble for resources in the execution process?
It is worth noting that parameter initialization in the MPSO-GA in the experiment is simulated, where δ=100, itersmax=1000, ϵmax=0.8, ϵmin=0.2, σpstar=0.9, σpend=0.2, σgstar=0.4, and σgend=0.9.
(1) Application: the same as in the section 4.1, the application types include the LPRA, TDA, and FRA. In the literature, it is usually assumed that the task execution time is in a linear relationship with the allocated computing resource quantity, and likewise for the data transmission time and the network communication capacity. Therefore, by collecting the execution times and the data transmission times on equipment with different performances, the computing workloads and the data transmission amounts of the different program fragments in the applications are obtained in advance, and they are then fitted by least-square linear regression. In addition, each type of application needs to appoint a deadline constraint to check whether an unloading solution is feasible. The deadline limits are set below.
dl1={3.1, 3.4, 3.8, 4.2}, dl2={4, 5.1, 6.3, 7.6}, dl3={2.6, 3, 3.5, 4.2}
dl1, dl2, and dl3 are respectively the deadlines of the license plate recognition, target detection, and human face recognition applications. The smaller the value is, the stricter the time constraint is.
(2) Network environment: in this part, we have designed an application scenario, as shown in the drawings.
(3) Contrast algorithms: to compare and evaluate the performance of the MPSO-GA in the MEC environment, the following three methods are introduced: (i) PSO: the conventional PSO cannot be directly used for the discrete problem herein, so the discrete problem is converted into a continuous one by taking remainders, and the fitness function is the same as that of the MPSO-GA; (ii) GA: the GA uses a binary problem coding method, with the dimensionality equal to the number of the servers and the fitness function the same as that of the MPSO-GA; and (iii) Ideal: the optimal solution found in the experimental process is kept, to approach the real optimal solution as far as possible. In the experiments, the unloading results of the same configuration may be different; therefore, through 30 repeated experiments, the mean value of the feasible solutions therein is taken to measure the system overhead.
To make sure that the applications will not scramble for resources, the different types of applications are respectively executed in different time periods. It is assumed that in the time period of 0-180 s, the license plate recognition application is executed at n1, n2, and n3; in the time period of 180-240 s, the target detection application is executed at n4, n5, and n6; and in the time period of 240-360 s, the human face recognition application is executed at n7, n8, and n9. The experimental results of the different types of applications under the different deadline constraints are respectively shown in the drawings.
Viewed from the experimental results, by using the MPSO-GA, PSO, and GA, the system overhead is reduced with the widening of the deadline. This is because the strategies based on meta-heuristic algorithms tend to assign more tasks to the servers with low overhead when the deadline is not strict. This can also be seen from the drawings.
Viewed from the saving of the system overhead, the deadline difference, or the proportion of feasible solutions, the effect obtained by the MPSO-GA is superior to those of the PSO and GA, because the MPSO-GA can adaptively adjust its search capability according to the current condition and perform iterative evolution from a global perspective. The PSO algorithm has the problems of a poor local search capability and a low search precision, which makes it difficult to find the optimal solution. The performance of the GA is greatly affected by the deadline constraint because its search range is local during each iteration. Generally speaking, compared with the PSO and GA, the system overheads of the MPSO-GA are respectively reduced by about 3.95%-11.56% and 2.11%-17.51%.
4.2.3 RQ4: Simultaneous Execution Condition of the Applications in Different Deadline Constraints
To research the decision-making effect of the MPSO-GA in the more complicated network environment, the applications of the IoT equipment n1-n9 are triggered simultaneously at the time 0 s and are executed for 360 s. The experimental results are shown in the drawings.
Viewed from the experimental results, the three algorithms cannot obtain a feasible solution under the two strictest time constraint conditions, and the PSO and GA have feasible solutions only under the loosest time constraint. Different from RQ3, the different applications scramble for resources in the execution process, so that the response times of some applications cannot satisfy the deadline; therefore, even under the second strictest time constraint, there is no feasible solution. However, it can be found by comparison that the solutions generated by the MPSO-GA under these two conditions can make the response times of the overall applications closer to the time constraints.
At the same time, with the increase of the quantity of the applications, the quantities of the services and databases to be determined are increased, and the size of the problem space increases exponentially; therefore, the difficulty for the algorithm to obtain an appropriate unloading solution is greatly increased. In spite of this, the MPSO-GA is still capable of obtaining a feasible solution under the third strictest time constraint.
The method is capable of supporting the computing unloading of various applications in the MEC as well as the multi-task and multi-server unloading decision making. The experimental results show that the method can significantly improve the performance of the applications in different scenarios, and the extra execution overhead is acceptable. Besides, through the simulated comparison experiments, it is verified that the method has more opportunities to obtain a feasible solution under a stricter time constraint condition; moreover, under a looser time constraint condition, the overhead of the obtained feasible solution is minimal.
The above is merely preferred embodiments of the present invention and is not a limitation to the present invention in other forms. Those skilled in the art may alter or modify the disclosed technical content into equivalent embodiments with equivalent changes. Any subtle modifications, equivalent changes, and modifications made to the above embodiments in accordance with the technical substance of the present invention shall fall within the scope of the technical scheme of the present invention, without departing from the content of the technical scheme of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202211475176.3 | Nov 2022 | CN | national |
This application is the continuation application of International Application No. PCT/CN2023/097420, filed on May 31, 2023, which is based upon and claims priority to Chinese Patent Application No. 202211475176.3, filed on Nov. 23, 2022, the entire contents of which are incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2023/097420 | May 2023 | WO |
Child | 18411071 | US |