This Application claims the benefit of Taiwan Patent Application No. 105132698, filed on Oct. 11, 2016, the entirety of which is incorporated by reference herein.
The present invention relates to task management of an OS (Operating System), and in particular, to methods for determining processing nodes for executed tasks and apparatuses using the same.
NUMA (Non-uniform memory access) is a computer memory design used in multiprocessing to reduce the time spent waiting for access to data stored in memory, where the memory access time depends on the memory location relative to the processor. NUMA provides an architecture in which each processor (or each group of processors) is allocated a respective memory (that is, a local memory). Under NUMA, a processor can access its own local memory faster than a non-local memory (such as a memory local to another processor or a memory shared between processors). A conventional OS (Operating System) kernel determines which processing node of NUMA is used to execute a task according to how frequently the task accesses local and non-local memories. However, execution efficiency does not depend solely on memory access. Thus, it is desirable to have methods for determining processing nodes for executed tasks, and apparatuses using the same, that improve execution efficiency by taking other factors into account.
An embodiment of the invention introduces a method for determining processing nodes for executed tasks, performed by a processor when loading and executing a daemon, and comprising: obtaining a first evaluation score associated with usages of I/O devices of a first node by a task in a time interval; obtaining a second evaluation score associated with usages of I/O devices of a second node by the task in the time interval, in which the task is executed by a processor of the first node; and when the second evaluation score is higher than the first evaluation score, switching execution of the task to a processor of the second node.
An embodiment of the invention introduces an apparatus for determining processing nodes for executed tasks including: a first node and a second node, in which the first node includes a processor loading and executing a daemon and a task. The daemon obtains a first evaluation score associated with usages of I/O devices of the first node by the task in a time interval; obtains a second evaluation score associated with usages of I/O devices of the second node by the task in the time interval; and, when the second evaluation score is higher than the first evaluation score, switches execution of the task to the processor of the second node.
A detailed description is given in the following embodiments with reference to the accompanying drawings.
The present invention can be fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
The present invention will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto and is only limited by the claims. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Use of ordinal terms such as “first”, “second”, “third”, etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for use of the ordinal term) to distinguish the claim elements.
In some embodiments, the hardware of the node 110 (hereinafter referred to as node 0) is configured to provide only a data-storage service. Specifically, the processor 111 may access data of storage devices 115a and 115b via a controller 115 and access data of storage devices 117a and 117b via a controller 117. The storage devices 115a, 115b, 117a and 117b may be arranged into RAID (Redundant Array of Independent Disks) to provide a secure data-storage environment. The processor 111 is more suitable for executing tasks of mass-storage access. The storage devices 115a, 115b, 117a and 117b may provide non-volatile storage space for storing a wide range of electronic files, such as Web pages, documents, audio files, video files, etc. It should be understood that the processor 111 may connect more or fewer controllers and each controller may connect more or fewer storage devices and the invention should not be limited thereto.
In some embodiments, the hardware of the node 130 (hereinafter referred to as node 1) is configured to provide data-storage service and communications with peripherals. Specifically, the processor 131 may access data of storage devices 135a and 135b via a controller 135 and communicate with peripherals via peripheral controller 137 or 139. Each of the peripheral controllers 137 and 139 may be utilized to communicate with one I/O device. The I/O device may be a LAN (Local Area Network) communications module, a WLAN (Wireless Local Area Network) communications module, or a Bluetooth communications module, such as the IEEE 802.3 communications module, the 802.11x communications module, the 802.15x communications module, etc., to communicate with other electronic apparatuses using a given protocol. The I/O device may be a USB (Universal Serial Bus) module. The I/O device may be an input device, such as a mouse, a touch panel, etc., to generate position signals of the mouse pointer. The I/O device may be a display device, such as a TFT-LCD (Thin film transistor liquid-crystal display) panel, an OLED (Organic Light-Emitting Diode) panel, or others, to display input letters, alphanumeric characters and symbols, dragged paths, drawings, or screens provided by an application for the user to view. The processor 131 is more suitable for executing tasks that transmit and receive large amounts of I/O data. It should be understood that the processor 131 may connect more or fewer controllers and each controller may connect more or fewer storage devices and the invention should not be limited thereto. Furthermore, the processor 131 may connect more or fewer peripheral controllers and the invention should not be limited thereto.
The storage devices 115a, 115b, 117a and 117b may be referred to as the local storage devices of the processor 111, and the processor 111 may access data of its local storage devices directly. The processor 111 may communicate with the processor 131 to use the storage devices 135a and 135b of the node 130 via the interconnect interface. The storage devices 135a and 135b may be referred to as the cross-node storage devices of the processor 111. Moreover, the processor 111 may communicate with the processor 131 to use the peripheral controllers 137 and 139 of the node 130 via the interconnect interface. The peripheral controllers 137 and 139 may be referred to as the cross-node peripheral controllers of the processor 111.
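The distinction between local and cross-node devices can be illustrated with a small lookup, offered as a sketch only. The reference numerals come from the description above; the table layout and the names NODE_DEVICES, PROCESSOR_NODE, owning_node and access_kind are hypothetical and are not part of the described apparatus.

```python
# Hypothetical topology table built from the reference numerals in the text.
NODE_DEVICES = {
    110: {"storage": ["115a", "115b", "117a", "117b"],
          "peripheral_controller": []},
    130: {"storage": ["135a", "135b"],
          "peripheral_controller": ["137", "139"]},
}

# Which node each processor belongs to.
PROCESSOR_NODE = {111: 110, 131: 130}


def owning_node(device):
    """Return the node that a storage device or peripheral controller belongs to."""
    for node, by_type in NODE_DEVICES.items():
        for devices in by_type.values():
            if device in devices:
                return node
    raise KeyError(device)


def access_kind(processor, device):
    """Classify an access as "local" or "cross-node" for the given processor."""
    if PROCESSOR_NODE[processor] == owning_node(device):
        return "local"
    return "cross-node"
```

Under these assumptions, access_kind(111, "115a") is classified as local, while access_kind(111, "137") goes through the interconnect interface and is classified as cross-node.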
In some implementations, the OS kernel determines which one of the processors 111 and 131 is used to execute a task according to how frequently the task accesses local and cross-node memories. However, execution efficiency does not depend solely on memory access. In some hardware installations, the execution efficiency of a task may be greatly affected by the use of storage devices and I/O devices. Thus, embodiments of the invention introduce methods for determining processing nodes for executed tasks, which are practiced by a daemon when it is loaded and executed by the processor 111 or 131. In a multitasking OS, a daemon is a computer program that runs as a background task after system booting, rather than being under the direct control of the user. When a task is executed by the processor 111 of the node 110, the daemon periodically obtains a first evaluation score associated with usages of the I/O devices of the node 110 by the task in a time interval and a second evaluation score associated with usages of the I/O devices of the node 130 by the task in the time interval. When the second evaluation score is higher than the first evaluation score, the execution of the task is switched to the processor 131 of the node 130.
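The comparison made by the daemon can be sketched as follows. The function name and its parameters are hypothetical. The strict comparison mirrors the "higher than" wording above, so under this reading a tie leaves the task on its current node.

```python
def node_after_comparison(current_node, other_node, first_score, second_score):
    """Return the node that should execute the task after one comparison.

    first_score  -- evaluation score of the node currently executing the task
    second_score -- evaluation score of the other node
    The task moves only when the second score is strictly higher.
    """
    if second_score > first_score:
        return other_node
    return current_node
```

For example, a task on the node 110 with scores 1 (node 110) and 2 (node 130) would be moved to the node 130, while equal scores would keep it on the node 110.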
The tasks described in embodiments of the invention are the minimum units that can be scheduled in the OS, including arbitrary processes, threads and kernel threads, for example, an FTP (File Transfer Protocol) server process, a keyboard driver process, an I/O interrupt thread, etc.
Assume that the daemon is preset to be executed by the processor 111 and I/O policies are preset to be stored in the storage device 115a. The I/O policies may be implemented as a file of a file system, a data table of a relational database, an object of an object database, or others, and contain usage weights of different I/O device types (such as storage devices and peripherals) for each application. Exemplary I/O policies are provided as follows:
As to the application A, its usage weight of peripherals being higher than that of storage devices means that the application A theoretically uses peripherals more frequently than storage devices. As to the application C, its usage weight of storage devices being higher than that of peripherals means that the application C theoretically uses storage devices more frequently than peripherals. As to the application B, its usage weight of storage devices being the same as that of peripherals means that the application B theoretically uses storage devices and peripherals about equally often. Although Table 1 describes the usage weights as integers, those skilled in the art may devise usage weights with other types of numbers and the invention should not be limited thereto. For example, the usage weights of storage devices and peripherals for the application A may be set to 0.33 and 0.67, respectively, and the usage weights of storage devices and peripherals for the application B may be set to 0.5 and 0.5, respectively. The memory 113 is used to store and maintain evaluation scores of storage devices and peripherals for each task, thereby enabling the daemon to determine whether each task is to be executed by the processor 111 or the processor 131. The memory 113 may store an evaluation table to facilitate the calculation of evaluation scores and the determination of nodes for each task. The evaluation table may be implemented as one two-dimensional array, multiple one-dimensional arrays, or similar but different data structures. An exemplary evaluation table is provided below:
The evaluation table contains multiple records and each record stores the information necessary for calculating evaluation scores for one task. For example, the evaluation table contains records of tasks T1 and T2. Each record stores a task ID, the I/O policies of the application associated with the task, the statuses indicating how the task has used the storage devices and peripherals of the node 110 and the storage devices and peripherals of the node 130, the evaluation scores of the node 110 and the node 130, and a determination result. The statuses indicating how the task has used I/O devices of different types of a particular node in the time interval are represented by numbers. In some embodiments, the number may indicate whether the task has used I/O devices of a particular type of a particular node in the time interval, where “1” indicates yes and “0” indicates no. In some embodiments, the number may indicate the quantity of I/O devices of a particular type of a particular node that have been used by the task in the time interval.
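One record of the evaluation table could be modeled as shown below. This is a minimal sketch: the class name EvaluationRecord, its field names, and the helper status_number are hypothetical, chosen only to mirror the fields listed above and the two status conventions (a yes/no flag or a device count).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EvaluationRecord:
    """One record of the evaluation table (field names are assumptions)."""
    task_id: str
    # Usage weight per I/O device type, copied from the I/O policies.
    weights: Dict[str, float]
    # node -> I/O device type -> status number for the time interval.
    usage: Dict[int, Dict[str, int]]
    # node -> evaluation score, filled in later.
    scores: Dict[int, float] = field(default_factory=dict)
    # Node determined to execute the task, or None if not yet determined.
    decision: Optional[int] = None


def status_number(devices_used: int, as_flag: bool) -> int:
    """Encode a usage status either as a 1/0 flag or as a device count."""
    if as_flag:
        return 1 if devices_used > 0 else 0
    return devices_used
```

With the flag convention, a task that used two storage devices of a node in the interval is recorded as 1; with the count convention, it is recorded as 2.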
Although the embodiments describe how the processor 111 is used to execute the daemon by default, it is not intended to limit the daemon to only being executed by the processor 111. The execution of the daemon can be migrated to another processor. The OS may migrate the daemon's execution to another processor for a particular purpose or at a specific moment and the invention should not be limited thereto.
Specifically, the processor 111 detects whether the I/O policies of the storage device 115a have been changed (step S210). In some embodiments of step S210, when the hardware installation has been changed (such as a new storage device or peripheral has been inserted into a node, or a storage device or peripheral has been removed from a node, etc.), the I/O policies of the storage device 115a are updated accordingly. In some other embodiments of step S210, the I/O policies of the storage device 115a are updated via MMI (Man Machine Interface) by the user. When the I/O policies of the storage device 115a have been changed (the “Yes” path of step S210), the processor 111 updates the I/O policies for the different types of I/O devices of each task, which are stored in the evaluation table of the memory 113, according to the I/O policies of the storage device 115a (step S271), calculates evaluation scores of all nodes for each task according to the updated I/O policies for different types of I/O devices of this task and the usage statuses of different types of I/O devices by this task in the time interval, and writes the calculated evaluation scores in the evaluation table of the memory 113 (step S273), determines which node will execute each task according to the calculation results of the evaluation table, and writes the determination results in the evaluation table of the memory 113 (step S275), and, if required, switches the execution of one or more tasks to the proper node or nodes according to the determination results produced in step S275 (step S277). 
When the I/O policies of the storage device 115a have not been changed (the “No” path of step S210), the processor 111 calculates the evaluation scores of all nodes for each task according to the I/O policies for different types of I/O devices of this task, which are stored in the evaluation table of the memory 113, and the usage statuses of different types of I/O devices by this task in the time interval, and writes the calculated evaluation scores in the evaluation table of the memory 113 (step S273), determines which node will execute each task according to the calculation results of the evaluation table, and writes the determination results in the evaluation table of the memory 113 (step S275), and, if required, switches the execution of one or more tasks to the proper node or nodes according to the determination results (step S277).
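The two paths differ only in whether step S271 runs, which the following sketch makes explicit. The function names run_pass and trace_pass are hypothetical, and each step is passed in as a caller-supplied callable rather than implemented here.

```python
def run_pass(policies_changed, update_policies, calculate_scores,
             decide_nodes, switch_tasks):
    """Run one pass of the flow; every argument is a caller-supplied callable."""
    if policies_changed():   # step S210
        update_policies()    # step S271, "Yes" path only
    calculate_scores()       # step S273
    decide_nodes()           # step S275
    switch_tasks()           # step S277


def trace_pass(changed):
    """Return the order in which the steps run, for illustration only."""
    trace = []
    run_pass(
        policies_changed=lambda: changed,
        update_policies=lambda: trace.append("S271"),
        calculate_scores=lambda: trace.append("S273"),
        decide_nodes=lambda: trace.append("S275"),
        switch_tasks=lambda: trace.append("S277"),
    )
    return trace
```

Calling trace_pass(True) lists S271 before the three common steps, while trace_pass(False) lists only S273, S275 and S277.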
In step S271, the processor 111 may repeatedly perform a loop for updating the I/O policies for different types of I/O devices of each task, which are stored in the evaluation table of the memory 113. The memory 113 may store information regarding an application associated with each executed task. In each iteration, the processor 111 selects a task of the evaluation table that has not been updated, finds the application that this task is associated with according to the information stored in the memory 113, looks up the usage weights of different I/O device types for the associated application in the I/O policies of the storage device 115a, and updates the usage weights of different I/O device types of this task in the evaluation table with the found ones. For example, when tasks T1 and T2 are respectively associated with applications A and C, the updated evaluation table is as follows:
In step S273, specifically, the daemon may obtain the statuses indicating how each task has used the I/O devices of different types of different nodes in the time interval via an API (Application Programming Interface) provided by the OS.
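The per-task loop of step S271 and the status collection at the start of step S273 can be sketched as follows. All names are hypothetical. The weights for the application A are the fractional example given earlier, the weights for the application C are assumed values that merely satisfy "storage higher than peripherals", and query_usage is a stand-in for whatever OS API supplies the statuses.

```python
def update_weights(evaluation_table, task_application, io_policies):
    """Step S271: copy each task's usage weights from its application's policy."""
    for task_id, record in evaluation_table.items():
        application = task_application[task_id]
        # Copy so that later policy edits do not silently alter the table.
        record["weights"] = dict(io_policies[application])


def collect_statuses(evaluation_table, query_usage):
    """Start of step S273: store each task's usage statuses for the interval."""
    for task_id, record in evaluation_table.items():
        record["usage"] = query_usage(task_id)


io_policies = {
    "A": {"storage": 0.33, "peripheral": 0.67},
    "C": {"storage": 0.67, "peripheral": 0.33},  # assumed values
}
task_application = {"T1": "A", "T2": "C"}
evaluation_table = {"T1": {}, "T2": {}}

update_weights(evaluation_table, task_application, io_policies)
# Hypothetical statuses: node -> I/O device type -> 1/0 flag.
collect_statuses(
    evaluation_table,
    lambda task_id: {110: {"storage": 1}, 130: {"storage": 0, "peripheral": 1}},
)
```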
In some embodiments, the processor affinity module 311 may use Equation (1) to calculate evaluation scores associated with the node 110 for each task:
$$S_1=\sum_{i=1}^{m_1}\left(w_i\times c_{1,i}\right),$$
where $S_1$ represents the evaluation score associated with the node 110, $m_1$ represents the total number of types of I/O devices of the node 110, $w_i$ represents the usage weight of the $i$th type of I/O devices for the application associated with this task, and $c_{1,i}$ represents the status indicating how this task has used the $i$th type of I/O devices of the node 110 in the time interval. The processor affinity module 311 may use Equation (2) to calculate evaluation scores associated with the node 130 for each task:
$$S_2=\sum_{i=1}^{m_2}\left(w_i\times c_{2,i}\right),$$
where $S_2$ represents the evaluation score associated with the node 130, $m_2$ represents the total number of types of I/O devices of the node 130, $w_i$ represents the usage weight of the $i$th type of I/O devices for the application associated with this task, and $c_{2,i}$ represents the status indicating how this task has used the $i$th type of I/O devices of the node 130 in the time interval. Subsequently, the processor affinity module 311 writes the calculation results in the evaluation table of the memory 113. The updated evaluation table is provided as follows:
In other embodiments, the processor affinity module 311 may omit the usage weight of the $i$th type of I/O devices for the application associated with this task. The processor affinity module 311 may use Equation (3) to calculate evaluation scores associated with the node 110 for each task:
$$S_1=\sum_{i=1}^{m_1}c_{1,i}.$$
The processor affinity module 311 may use Equation (4) to calculate evaluation scores associated with the node 130 for each task:
$$S_2=\sum_{i=1}^{m_2}c_{2,i}.$$
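Equations (1) through (4) reduce to two small sums, sketched below. The function names and the dictionary layout are hypothetical, and the integer weights (storage 1, peripherals 2) are assumed values used only to keep the arithmetic easy to follow. Each statuses dictionary holds one entry per I/O device type present on the node, so its length plays the role of the number of device types.

```python
def weighted_score(weights, statuses):
    """Equations (1) and (2): sum of usage weight times status per device type."""
    return sum(weights[device_type] * status
               for device_type, status in statuses.items())


def unweighted_score(statuses):
    """Equations (3) and (4): sum of the statuses, with the weights omitted."""
    return sum(statuses.values())


# Assumed integer weights for one application.
weights = {"storage": 1, "peripheral": 2}
# Assumed 1/0 statuses for one task in the time interval.
node_110_statuses = {"storage": 1}                   # one device type
node_130_statuses = {"storage": 0, "peripheral": 1}  # two device types

s1 = weighted_score(weights, node_110_statuses)  # 1*1
s2 = weighted_score(weights, node_130_statuses)  # 1*0 + 2*1
```

Under these assumed values the weighted scores are 1 for the node 110 and 2 for the node 130, whereas the unweighted scores are 1 for both nodes, which shows how the usage weights can change the outcome.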
In step S275, for each specific task, the processor affinity module 311 determines the node with the highest evaluation score to execute this task and writes the decisions in the evaluation table of the memory 113. The updated evaluation table is provided as follows:
The memory 113 may store information indicating which node is currently executing each task. In step S277, specifically, the processor affinity module 311 may repeatedly perform a loop to move each task that needs to be switched to a processor of a proper node to be executed. In each iteration, the processor affinity module 311 selects from the evaluation table a task that has not been processed, and determines whether the execution of this task needs to be switched to a processor of a proper node according to the decision of the evaluation table for this task and information indicating which node is currently executing this task. When determining that this task needs to be switched, the processor affinity module 311 instructs the kernel affinity control interface 335 to switch execution of this task to the determined node. Subsequently, the kernel affinity control interface 335, through the kernel scheduler 337, moves the context of this task to the memory of the determined node, and arranges this task in a schedule of the processor of the determined node. Assume that the task T1 is currently executed by the processor 111 of the node 110: The processor affinity module 311 instructs the kernel affinity control interface 335 to switch the execution of the task T1 to the node 130. It should be understood that the kernel affinity control interface 335 may be loaded and executed by the processor 111 or 131.
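The loop of step S277 can be sketched as follows. The function name switch_tasks and its parameters are hypothetical, and migrate is a stand-in callable for the kernel affinity control interface 335, not an actual kernel call. The decision for the task T1 follows the example above; the decision for the task T2 is an assumed value.

```python
def switch_tasks(decisions, current_nodes, migrate):
    """Step S277: move every task whose determined node differs from its current one.

    decisions     -- task ID -> node determined in step S275
    current_nodes -- task ID -> node currently executing the task (updated in place)
    migrate       -- hypothetical callable standing in for the kernel
                     affinity control interface 335
    """
    switched = []
    for task_id, target in decisions.items():
        if current_nodes[task_id] == target:
            continue  # already on the determined node; nothing to do
        migrate(task_id, target)
        current_nodes[task_id] = target
        switched.append(task_id)
    return switched


calls = []
current = {"T1": 110, "T2": 110}
moved = switch_tasks(
    {"T1": 130, "T2": 110},  # T2's decision is assumed for illustration
    current,
    lambda task_id, node: calls.append((task_id, node)),
)
```

Only the task T1 is handed to the stand-in interface here, because the task T2 is already on its determined node.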
In some embodiments, those skilled in the art may devise the aforementioned method to further take usage rates of the processors and access frequencies to the memories into account for determining whether execution of a task needs to switch to the processor of another node. For example, when a task is executed by the processor 111 of the node 110, the daemon periodically obtains a first evaluation score associated with usages of the I/O devices, the processor and the memory of the node 110 by the task in a time interval and a second evaluation score associated with usages of the I/O devices, the processor and the memory of the node 130 by the task in the time interval. When the second evaluation score is higher than the first evaluation score, the execution of the task is switched to the processor 131 of the node 130.
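The description does not state how the I/O, processor and memory factors are combined into one evaluation score. One possible combination, shown purely as an assumption, is a weighted sum over named factors; the function name, the factor names, the factor weights and the sample values below are all hypothetical.

```python
def combined_score(factor_values, factor_weights):
    """Assumed combination: weighted sum over the factors supplied for a node."""
    return sum(factor_weights[name] * value
               for name, value in factor_values.items())


# Assumed relative importance of each factor.
factor_weights = {"io": 2, "processor": 1, "memory": 1}
# Assumed per-node factor values for one task in the time interval.
node_110 = {"io": 1, "processor": 1, "memory": 0}
node_130 = {"io": 2, "processor": 0, "memory": 1}

first_score = combined_score(node_110, factor_weights)
second_score = combined_score(node_130, factor_weights)
```

With these assumed numbers the second evaluation score (5) is higher than the first (3), so the task would be switched to the processor 131 of the node 130.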
Although the embodiment has been described as having specific elements in
While the invention has been described by way of example and in terms of the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Number | Date | Country | Kind |
---|---|---|---|
105132698 | Oct 2016 | TW | national |