The disclosed technology relates to scheduling the execution of real-time tasks by processing apparatus, notably periodic real-time tasks that can be represented using a directed acyclic graph. Embodiments of the disclosed technology provide scheduling methods and scheduling apparatus that seek to promote energy-efficiency and so may be called power-aware scheduling methods and systems.
There are many applications in which processing apparatus must perform real-time tasks in a repetitive manner; such tasks may be considered periodic, and there is a known time period within which each instance of the task should be completed.
u = C/D Equation (1)
where C denotes a task's worst-case execution time and D denotes the time period within which it must be completed (its deadline).
For a given set τ of tasks {T1, T2, . . . , Tn}, ui represents the utilization of the task Ti, umax represents the utilization of the task within the set that has the highest utilization, and Uτ is the total utilization of the task set τ, where Uτ is defined by Equation (2) below:
Uτ = Σ(i=1 to n) ui Equation (2)
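Purely by way of illustration, the following short Python sketch computes the quantities defined by Equations (1) and (2) for a small, hypothetical task set (the values of C and D are invented for the example):

# Illustrative sketch only: the task parameters below are hypothetical.
# Each task Ti is described by its worst-case execution time C and its deadline D.
tasks = {"T1": (2.0, 10.0), "T2": (3.0, 12.0), "T3": (1.0, 4.0)}   # Ti: (C, D)

utilizations = {name: c / d for name, (c, d) in tasks.items()}     # Equation (1): u = C/D
u_max = max(utilizations.values())                                 # utilization of the heaviest task
u_total = sum(utilizations.values())                               # Equation (2): U_tau

print(utilizations)          # {'T1': 0.2, 'T2': 0.25, 'T3': 0.25}
print(u_max, u_total)        # 0.25 0.7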
In some cases, the processing apparatus available to execute a task may be a single-core processor. In such a case it is well known to use the EDF (Earliest Deadline First) algorithm to schedule the execution of tasks by the processor core. According to the EDF algorithm, the priority given to tasks depends only on their respective deadlines: the task whose deadline is earliest has the highest priority. The task having the highest priority is executed first and the other tasks remain in a queue.
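A minimal, purely illustrative Python sketch of single-core EDF ordering is given below; the job names, deadlines and execution times are hypothetical:

# Minimal single-core EDF sketch; the job data are hypothetical.
jobs = [("J1", 12.0, 3.0), ("J2", 8.0, 2.0), ("J3", 20.0, 5.0)]   # (name, absolute deadline, execution time)

t = 0.0
for name, deadline, wcet in sorted(jobs, key=lambda j: j[1]):     # earliest deadline = highest priority
    print(f"{name}: runs in [{t}, {t + wcet}), deadline {deadline}")
    t += wcet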
Multicore processors have become ubiquitous. When a multicore processor is used to execute tasks, it is known to use the GEDF (Global-EDF) algorithm to schedule the performance of the tasks by the various cores of the processor. All the tasks are placed in a global queue and are assigned a global priority. Tasks run on the cores according to their priority: whenever a core becomes free, it takes from the global queue the task having the highest priority.
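The following simplified, non-preemptive Python sketch (with hypothetical job data) illustrates the global queue used by GEDF: jobs are ordered by absolute deadline, and whenever a core becomes free it takes the job with the earliest deadline:

import heapq

# Hedged sketch of GEDF dispatching on a multicore processor; the job data are hypothetical.
# Each entry is (absolute_deadline, job_name, execution_time); heapq orders by deadline.
global_queue = [(12.0, "J1", 3.0), (8.0, "J2", 2.0), (20.0, "J3", 5.0), (9.0, "J4", 1.0)]
heapq.heapify(global_queue)

num_cores = 2
core_free_at = [0.0] * num_cores          # time at which each core next becomes idle

while global_queue:
    deadline, name, wcet = heapq.heappop(global_queue)             # highest-priority (earliest-deadline) job
    core = min(range(num_cores), key=lambda c: core_free_at[c])    # the core that becomes free first
    start = core_free_at[core]
    core_free_at[core] = start + wcet
    print(f"{name} -> core {core}: runs in [{start}, {start + wcet}), deadline {deadline}")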
In a wide variety of applications, a directed acyclic graph (DAG) can be used to model a task that is to be performed by processing apparatus. The DAG comprises vertices and directed edges. Each vertex represents a sub-task (or job) involved in the performance of the overall task, and the directed edges show the dependency between the different sub-tasks, i.e., which sub-task must be completed before another sub-task can be performed. The DAG representation makes it easy to understand the dependency between the sub-tasks making up a task, and the opportunities that exist for parallelism in processing the sub-tasks.
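As a purely illustrative sketch (the sub-task names and dependencies are hypothetical), a DAG task may be represented in code by listing, for each sub-task, the predecessors that must complete before it can start; sub-tasks whose predecessors have all completed may then be processed in parallel:

# Hedged sketch of a DAG task; the sub-task names and dependencies are hypothetical.
# predecessors[v] lists the sub-tasks that must complete before sub-task v can start.
predecessors = {"v1": [], "v2": ["v1"], "v3": ["v1"], "v4": ["v2", "v3"]}

done = set()
step = 0
while len(done) < len(predecessors):
    ready = [v for v, preds in predecessors.items()
             if v not in done and all(p in done for p in preds)]
    print(f"step {step}: sub-tasks executable in parallel: {ready}")
    done.update(ready)
    step += 1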
Each vertex in the graph can be an independent sub-task and, to execute such sub-tasks in real-world applications, each one may be deployed in a container such as the containers provided by Docker, Inc. As noted on the Docker Inc. website: “A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another”.
Considering the dependencies of the sub-tasks, it can be understood from
A task that can be represented using a DAG may be referred to as a DAG task. A scheduling approach for DAG tasks has been described by Saifullah et al in “Parallel Real-Time Scheduling of DAGs” (IEEE Trans. Parallel Distributed Syst., vol. 25, no. 12, 2014, pp. 3242-3252), the entire contents of which are hereby incorporated by reference. In order to schedule a general DAG task, the Saifullah et al approach implements a task decomposition that transforms the vertices of the DAG into sequential jobs, each having its own deadline and offset. The jobs can then be scheduled either pre-emptively or non-preemptively. Saifullah et al showed that in the case of applying their DAG task decomposition algorithm and then scheduling the resulting jobs using pre-emptive GEDF, it could be guaranteed that scheduling would be possible, respecting the tasks' timing constraints, for a set τ of real-time DAG tasks being executed by a multicore processor i having a number of cores Mi, provided that equation (3) below is respected:
Uτ ≤ Mi/4 Equation (3)
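Equation (3) lends itself to a simple admission check, illustrated by the following sketch (the utilization values and core count are hypothetical):

def is_schedulable(utilizations, num_cores):
    # Equation (3): accept the decomposed task set for pre-emptive GEDF if U_tau <= M_i / 4.
    return sum(utilizations) <= num_cores / 4.0

# Hypothetical example: a task set with total utilization 1.5.
print(is_schedulable([0.5, 0.4, 0.6], 8))    # True, since 1.5 <= 2.0
print(is_schedulable([0.5, 0.4, 0.6], 4))    # False, since 1.5 > 1.0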
Unfortunately, the scheduling approach of Saifullah et al described above does not take into account the energy consumption involved in the processing and, in particular, does not schedule execution of the tasks in a manner that seeks to reduce energy consumption.
Various power-management techniques are known for reducing energy consumption when processing nodes (i.e., processors, cores) execute tasks. For instance, dynamic voltage and frequency scaling (DVFS) techniques adjust the voltage and operating frequency of a CPU according to the current workload. At times when the workload is heavy, the CPU voltage and frequency are set high, whereas at times when the workload is light, the CPU voltage and frequency can be reduced so as to reduce the power required to perform the processing. DPM (dynamic power management) techniques dynamically control power usage, and thus potentially energy consumption, by selecting among a number of available CPU operating modes, e.g., sleep (idle) and active (running). Power-management techniques such as these enable processing apparatus to perform the required tasks while reducing the amount of power consumed.
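The benefit of DVFS can be illustrated using the commonly assumed approximation that dynamic power grows roughly with the cube of the operating frequency; the sketch below uses relative units and invented figures, not measurements of any particular processor:

# Illustrative DVFS comparison under the commonly assumed cubic power law P ~ f**3;
# the figures are relative units, not measurements of any particular processor.
def dynamic_energy(cycles, frequency):
    power = frequency ** 3            # relative dynamic power at this frequency
    time = cycles / frequency         # time needed to execute the given number of cycles
    return power * time               # energy = power x time (proportional to cycles * f**2)

cycles = 1e9
for f in (1.0, 0.8, 0.5):             # relative clock frequencies
    print(f"f = {f}: execution time = {cycles / f:.2e}, relative energy = {dynamic_energy(cycles, f):.2e}")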
Research is underway to develop so-called “power-aware” scheduling techniques, i.e., scheduling techniques that can schedule execution of tasks by processing apparatus in a manner that minimizes, or at least reduces, the energy consumption. In “Node Scaling Analysis for Power-Aware Real-Time Tasks Scheduling” (IEEE Transactions on Computers, Vol. 65, No. 8, August 2016, pp 2510-2521), the entire contents of which are hereby incorporated by reference, Yu et al have proposed an approach which seeks to reduce energy consumption by adjusting an initial schedule that has been generated by a scheduling algorithm. The adjustment increases the number of processing nodes (here, processor cores) which execute the processing but slows down the speed (processor clock frequency) so as to obtain an overall reduction in energy consumption. In the Yu et al proposal, in order to determine the appropriate adjustment in the number of cores and the core speed, the number of cores and core speed initially scheduled for processing the overall task set is considered (as well as certain inherent characteristics of the processing unit itself). In the Yu et al proposal, each real-time task in the set consists of a sequence of real-time jobs which must be performed one after the other. The Yu et al proposal does not consider how to schedule DAG tasks.
The disclosed technology has been made in the light of the above issues.
Embodiments of the disclosed technology provide a computer-implemented method of scheduling periodic real-time tasks on a multi-core processor, said tasks comprising sub-tasks and sub-task dependencies capable of representation by a directed acyclic graph, the method comprising:
Embodiments of scheduling methods according to the disclosed technology enable periodic real-time DAG tasks to be scheduled in an energy-efficient manner.
In the above-mentioned scheduling methods according to embodiments of the disclosed technology, the scheduling of execution of the sub-tasks of the segments is performed using a global earliest deadline first algorithm. Use of GEDF for scheduling the sub-tasks of the segments ensures that the timing constraints of the sub-tasks are met.
In certain preferred embodiments of the disclosed technology the generating of the timing diagram assigns to each segment a first number, m, of processor cores operating at a first speed, s; and the deciding of processor-core number and/or speed in respect of a segment changes the number of processor cores assigned to the segment to a second number m′ and selects a second speed s′ for the second number of processor cores, according to the following process:
where Ps and c are constants in the power consumption function of the processor,
and
Use of the above-described node-scaling approach can produce power savings that are close to optimal.
Embodiments of the disclosed technology still further provide a scheduling system configured to schedule periodic real-time tasks on a multi-core processor, said tasks comprising sub-tasks and sub-task dependencies capable of representation by a directed acyclic graph, said system comprising a computing apparatus programmed to execute instructions to perform any of the above-described scheduling methods. Such a scheduling system may be embodied in an edge server.
Embodiments of the disclosed technology still further provide a computer program comprising instructions which, when the program is executed by a processor of a computing apparatus, cause said processor to perform any of the above-described scheduling methods.
Embodiments of the disclosed technology yet further provide a computer-readable medium comprising instructions which, when executed by a processor of a computing apparatus, cause the processor to perform any of the above-described scheduling methods.
The techniques of embodiments of the disclosed technology may be applied to schedule performance of real-time tasks in many different applications, including but not limited to control of autonomous vehicles and tracking of moving objects or persons.
Further features and advantages of the disclosed technology will become apparent from the following description of certain embodiments thereof, given by way of illustration only, not limitation, with reference to the accompanying drawings in which:
The disclosed technology provides embodiments of computer-implemented scheduling methods, and corresponding scheduling systems, that incorporate measures that seek to reduce energy consumption.
A computer-implemented scheduling method 400 according to a first embodiment of the disclosed technology will now be described with reference to
The main steps in the scheduling method 400 according to the first embodiment are illustrated in the flow diagram of
In a step S401, the tasks in the queue are decomposed into segments. The preferred process for decomposing a task into segments will be discussed with reference to
In the scheduling technique described by Saifullah et al op. cit., in order to determine deadlines and release times for different sub-tasks, there is an intermediate step in which tasks are decomposed into segments and the decomposition can be represented using a type of synthetic timing diagram. First of all the DAG task is represented using a timing diagram Ti∞ generated based on the assumption that the available number of processing nodes is infinite, whereby a maximum use of parallel processing is possible. This timing diagram Ti∞ is then divided up into segments by placing a vertical line in the timing diagram at each location where a sub-task starts or ends, and the segmented timing diagram may be considered to be a synthetic timing diagram Tisyn.
In a similar way, in certain preferred embodiments of the disclosed technology the DAG task is represented using a timing diagram Ti∞ generated based on the assumption that the available number of processing nodes is infinite. This timing diagram Ti∞ is then divided up into segments by placing a vertical line at each location where a sub-task starts and at each sub-task's deadline, yielding a new synthetic timing diagram Tisyn′.
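A minimal sketch of this segmentation step is given below, using hypothetical sub-task start times and deadlines: the vertical lines of Tisyn′ correspond to the sorted set of sub-task start times and deadlines, and each segment is the interval between two consecutive lines:

# Hedged sketch of the segmentation used to build Ti_syn'; the timing values are hypothetical.
# Each sub-task is (name, start_time, deadline) taken from the infinite-core diagram Ti_inf.
subtasks = [("v1", 0.0, 4.0), ("v2", 4.0, 9.0), ("v3", 4.0, 7.0), ("v4", 9.0, 12.0)]

boundaries = sorted({t for _, start, deadline in subtasks for t in (start, deadline)})
segments = list(zip(boundaries[:-1], boundaries[1:]))
print("vertical lines:", boundaries)    # [0.0, 4.0, 7.0, 9.0, 12.0]
print("segments:", segments)            # [(0.0, 4.0), (4.0, 7.0), (7.0, 9.0), (9.0, 12.0)]

# The sub-tasks active in a segment are those whose [start, deadline) window spans it.
for lo, hi in segments:
    active = [name for name, start, deadline in subtasks if start <= lo and deadline >= hi]
    print(f"segment [{lo}, {hi}): active sub-tasks {active}")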
It may be considered that the period between a pair of vertical lines in
In the Saifullah et al approach, after their segment parameters have been determined, the individual deadlines and release times of each sub-task are determined from the deadlines and release times of the segments in which they are located, and the notion of segments ceases to be relevant. However, in certain preferred embodiments of the disclosed technology power-saving measures are implemented on the basis of the segments defined in Tisyn′.
More specifically, in step S402 of the method 400, for each segment SGj, an operating frequency fj is selected for all the processing nodes involved in processing tasks during that segment SGj. Various power-saving algorithms can be applied to determine an appropriate frequency setting, for example known DVFS techniques. However, in certain preferred embodiments of the disclosed technology the number of processing nodes involved in parallel-processing the sub-tasks of a given segment is extended from the initial number m defined in the synthetic timing diagram Tisyn′ to an extended number m′, and the speeds of the processing nodes are reduced from the initial speed s defined in the synthetic timing diagram Tisyn′ to a reduced speed s′, according to the node-scaling approach described by Yu et al op. cit. This enables a reduction to be achieved in the energy consumption involved in executing the processing allocated to this segment. The operating frequency fj selected in respect of a segment SGj corresponds to the reduced speed s′ determined for the extended number m′ of processing nodes executing sub-tasks in segment SGj.
The node-scaling approach involves the following steps:
Ps and c are constants in the power consumption function of the processor (and can be determined by running applications in the processor, as explained in Yu et al op. cit.).
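The exact expressions for m′ and s′ are given in Yu et al op. cit.; the sketch below merely illustrates the general shape of such a node-scaling computation under an assumed per-core power model P(s) = Ps + c·s³ and under the assumption that the aggregate capacity m·s is preserved. The cubic exponent, the constant values and the search procedure are illustrative assumptions, not the formulas of Yu et al:

# Hedged illustration of node scaling; these are NOT the exact Yu et al equations.
# Assumed per-core power model: P(s) = Ps + c * s**3, with Ps and c the processor constants
# referred to above; the cubic exponent and the search procedure are assumptions.
PS, C_CONST = 0.2, 1.0                   # hypothetical values of the constants

def total_power(m, s):
    return m * (PS + C_CONST * s ** 3)   # power drawn by m cores running at speed s

def scale_nodes(m, s, max_cores):
    # Keep the aggregate capacity m*s constant while trying more, slower cores,
    # and keep the combination that draws the least total power.
    best_m, best_s = m, s
    for m_prime in range(m, max_cores + 1):
        s_prime = m * s / m_prime
        if total_power(m_prime, s_prime) < total_power(best_m, best_s):
            best_m, best_s = m_prime, s_prime
    return best_m, best_s

# Hypothetical segment initially assigned 2 cores at full speed on an 8-core processor.
m_prime, s_prime = scale_nodes(2, 1.0, 8)
print(m_prime, round(s_prime, 3), round(total_power(m_prime, s_prime), 3))   # 4 0.5 1.3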
Then, in step S403 of method 400, the scheduling of the sub-tasks is performed, assuming the processing node numbers m′ and speeds s′ determined in step S402 for each segment SGj. In certain preferred embodiments of the disclosed technology the scheduling that is performed in step S403 makes use of the GEDF algorithm.
It should be understood that the segmenting and node-scaling approaches are used to calculate the power bound: that is, for each segment this approach is used to calculate the power bound, which in turn is used to decide the optimal CPU speed. The scheduling of the tasks within each segment is performed using the global EDF algorithm.
In effect, the method 400 cuts jobs into segments according to their release time and deadline, and the frequencies of processing nodes (cores) for jobs having the same release time are set in a manner which takes into account reduction in energy consumption. According to preferred embodiments of the scheduling method, in each segment the tasks are scheduled by global EDF, and the frequencies of cores are computed according to the method of Yu et al op. cit. The setting of the processing nodes to the computed operating frequencies may be achieved using frequency-adjustment tools provided in commercially-available processing devices (for example, when working on an Nvidia Nano platform, the nvpmodel toolkit may be used to set the computed frequencies).
The scheduling technique provided by embodiments of the disclosed technology is referred to in this document as SEDF, in view of the fact that it employs the GEDF algorithm on a per-segment basis.
The segmentation in SEDF is dynamic, and newly arriving tasks can be considered immediately and grouped into segments. Therefore, SEDF can be used for both static scheduling and dynamic scheduling of real-time DAG tasks in a real multi-core device.
An implementation of the overall process may be represented by the logical flow set out below, in which the input is a set τ of tasks {T1, T2, . . . , Tn}, and the number of available processing cores is N. The output from the process is a schedule for execution of the task set. The scheduling may be performed by a processing device to work out how to schedule its own execution of tasks. On the other hand, the scheduling may be performed by a first processing device to work out a schedule according to which a second device will execute the tasks.
time ← 0, SE ← ∅; // SE is a set of tasks and is used to collect all the sub-tasks in the segment
while !stop do
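Only the first line of the listing is reproduced above. Purely by way of illustration, the following Python sketch shows one possible overall shape of such an SEDF loop; the data layout, the simplified node-scaling helper and the power constants are assumptions made for the purpose of the example, not the exact implementation:

# Hedged end-to-end sketch of an SEDF-style loop; the data layout, the helper below and the
# power constants are assumptions for illustration, not the exact implementation.
def scale_nodes(m, s, max_cores, ps=0.2, c=1.0):
    # Minimal node-scaling helper mirroring the earlier sketch (keeps m*s constant).
    candidates = [(mp * (ps + c * (m * s / mp) ** 3), mp, m * s / mp)
                  for mp in range(m, max_cores + 1)]
    _, mp, sp = min(candidates)
    return mp, sp

def sedf(subtasks, num_cores):
    # subtasks: (name, release_time, deadline, execution_time) obtained by task decomposition.
    boundaries = sorted({t for _, r, d, _ in subtasks for t in (r, d)})
    schedule = []
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):                    # one pass per segment
        segment = [st for st in subtasks if st[1] <= lo and st[2] >= hi]   # SE: sub-tasks of the segment
        if not segment:
            continue
        m = min(len(segment), num_cores)                                   # cores initially assigned
        m_prime, s_prime = scale_nodes(m, 1.0, num_cores)                  # node scaling for this segment
        edf_order = [st[0] for st in sorted(segment, key=lambda st: st[2])]  # GEDF: earliest deadline first
        schedule.append({"segment": (lo, hi), "cores": m_prime, "speed": s_prime, "order": edf_order})
    return schedule

# Hypothetical task set on a 4-core processor.
print(sedf([("v1", 0.0, 4.0, 2.0), ("v2", 4.0, 9.0, 3.0), ("v3", 4.0, 7.0, 1.5)], 4))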
SEDF is based on estimation of the optimal power consumption. Estimating the optimal power consumption for a real-time DAG task set can be modelled as an optimization problem that is NP-hard. Inspired by dynamic programming, which simplifies a complicated problem by breaking it down into simpler sub-problems in a recursive manner, the tasks are aligned into several parallel threads and broken down into small segments according to their release times and deadlines so as to simplify the problem. In each segment, there are independent tasks with the same release time running on a multi-core system, and DVFS can be applied in each segment to optimize the power consumption of the tasks.
The scheduling methods provided by embodiments of the disclosed technology are conveniently put into practice as computer-implemented methods. Thus, scheduling systems according to embodiments of the disclosed technology may be implemented on a general-purpose computer or device having computing capabilities, by suitable programming of the computer. Thus, scheduling methods according to embodiments of the disclosed technology may be implemented as illustrated schematically in
Furthermore, embodiments of the disclosed technology provide a computer program containing instructions which, when executed on computing apparatus, cause the apparatus to perform the method steps of one or more of the methods described above.
Embodiments of the disclosed technology further provide a non-transitory computer-readable medium storing instructions that, when executed by a computer, cause the computer to perform the method steps of one or more of the methods described above.
Simulations were performed to compare, on the one hand, the power consumption involved in executing periodic real-time DAG tasks according to scheduling determined by various known scheduling algorithms with, on the other hand, the power consumption achieved when scheduling the same tasks using an embodiment of the SEDF scheduling method according to the disclosed technology. The results of the simulations are illustrated in
The algorithms compared in the graphs of
As can be seen from
As can be seen from
Thus, it can be seen that the power-aware scheduling methods proposed by embodiments of the disclosed technology enable periodic real-time DAG tasks to be scheduled in a manner which is energy-efficient.
The scheduling methods and systems according to embodiments of the disclosed technology can be employed for scheduling the execution of periodic DAG tasks in a wide variety of applications. For example, these techniques can be applied in connection with mobile devices (phones and the like), for which conservation of battery power is an important issue, to schedule the execution of tasks (e.g., tasks involved in streaming) in an energy-efficient manner. Another application is to schedule execution of tasks in vehicle ad hoc networks, where sensor data processing often involves execution of real-time DAG tasks on edge devices or modules. Indeed, there are many edge computing scenarios where application of the scheduling methods and systems provided by embodiments of the disclosed technology can provide advantages. One example of such a scenario is discussed below to facilitate understanding of the utility of the disclosed technology.
In the example illustrated in
In the example illustrated in
The system 100 may be configured to implement a variety of applications, not just the tracking application illustrated in
Although the disclosed technology has been described above with reference to certain specific embodiments, it will be understood that the disclosed technology is not limited by the particularities of the specific embodiments but, to the contrary, that numerous variations, modifications and developments may be made in the above-described embodiments within the scope of the appended claims.
Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57. This application claims priority to International Patent Application No. PCT/CN2022/102261, filed Jun. 29, 2022, the disclosure of which is hereby incorporated by reference in its entirety.