Embodiments of the invention relate to the management of processor cores in a multicore processor system.
Most modern computing systems provide hot-plug support that allows a processor core to be powered on or off, or physically inserted or removed, during operating system (OS) runtime without restarting the system. In a multicore processor system that supports hot-plug, the OS can unplug a processor core to remove it from the system and can replug it back into the system on demand, without the processor core being physically unplugged or re-plugged. A hot-pluggable system is adaptable to the changing capacity demand, as processor cores can be dynamically provisioned on demand. Moreover, for system reliability purposes, a hot-pluggable system can remove a faulty processor core during OS runtime, keeping the processor core off the system execution path.
When a processor core is hot-plugged from a system, the processor core is offline from the OS kernel's standpoint, and a part or all of its file system is removed. Typically, in a multicore processor system, one of the processor cores is designated as a default keeper of system information. For example, a user space application may send an inquiry to the default processor core to find out an operating status of the system. A problem arises when the default processor core is offline, e.g., hot-plugged, from the system. Some user space applications may be unaware of the offline state of the default processor core, and may continue to send inquiries to the default processor core. The responses, if any, to such inquiries are unreliable and unpredictable. Therefore, there is a need for providing reliable system information in response to inquiries when the inquiries are sent to an offline processor core in a multicore processor system.
In one embodiment, a method is provided for mapping processor cores in a multicore system that includes a plurality of processor cores. The method comprises: detecting that an offline processor core is present among the plurality of processor cores; mapping the offline processor core to a mapped processor core, which is selected from an emulated processor core and one or more online processor cores among the plurality of processor cores, wherein the emulated processor core is a software construct containing an emulated state of the offline processor core; re-directing a system call to the mapped processor core, wherein the system call is sent from a requestor to the offline processor core to request system information from the offline processor core; and returning the system information from the mapped processor core to the requestor in response to the system call.
In another embodiment, a system includes a plurality of processor cores and memory. The memory contains instructions executable by the plurality of processor cores. The system is operative to detect that an offline processor core is present among the plurality of processor cores; map the offline processor core to a mapped processor core, which is selected from an emulated processor core and one or more online processor cores among the plurality of processor cores, wherein the emulated processor core is a software construct containing an emulated state of the offline processor core; re-direct a system call to the mapped processor core, wherein the system call is sent from a requestor to the offline processor core to request system information from the offline processor core; and return the system information from the mapped processor core to the requestor in response to the system call.
According to embodiments described herein, a multicore processor system provides a mapping of processor cores such that a request for system information can be made to an offline processor core.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that different references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean at least one. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
In the following description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure the understanding of this description. It will be appreciated, however, by one skilled in the art, that the invention may be practiced without such specific details. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
It should be noted that a “multicore processor system” as used herein may be arranged and managed as one or more clusters. A multicore processor system may be a multicore system, a multi-processor system, or a combination of both, depending upon the system implementation. In other words, the proposed method may be applicable to any multicore system and multi-processor system arranged and managed as one or more clusters. A “processor core” as used herein may be a core, a processor, a central processing unit (CPU), a graphics processing unit (GPU), or a processing element of any kind. A “cluster” as used herein may be a group of cores, processors, CPUs, GPUs or processing elements of any kind.
Furthermore, a processor core is offline when it is deactivated from the OS kernel's standpoint. That is, an offline processor core is removed from a list of processor cores which receive task assignments from a task scheduler. An offline processor core is in a power-off state or an ultra-low power state. A processor core in the ultra-low power state receives just enough power to retain data in its caches but not enough to perform logical calculations. A non-limiting example of an offline processor core is a processor core that is hot-plugged (i.e., unplugged) from a system. A processor core is online when it is activated from the OS kernel's standpoint. That is, an online processor core is responsive to inquiries and task assignments. An online processor core may be actively executing tasks, or may be ready for task assignments if it has not yet received a task assignment. An online processor core is in a power-on state.
In the following description, the term “processor core,” unless specified otherwise, refers to a physical processor core in the multicore processor system. A “logical processor core” refers to a logical or virtual entity receiving task assignments for execution, but the actual execution is performed on a physical processor core to which the logical processor core is mapped. The term “emulated processor core” refers to a software construct that includes a set of data structures to store the emulated state of one or more offline processor cores. The term “active processor core” refers to an online processor core (which is also a physical processor core) that is in active operation.
Embodiments of the invention provide a system and method for mapping an offline processor core to another processor core (referred to as a “mapped processor core”) in a multicore processor system. The mapped processor core may be an online processor core, an active processor core or an emulated processor core. The offline processor core may be any of the processor cores in the multicore processor system. In one embodiment, the offline processor core may be the default processor core that keeps system information and exports system information to user space applications upon request.
In one embodiment, the multicore processor system 100 includes a multi-level cache structure. For example, each processor core may include or have exclusive access to an L1 cache, and processor cores of the same cluster may share the same L2 cache or additional levels of caches. In another embodiment, each processor core may include or have exclusive access to an L1 cache and an L2 cache, and processor cores of the same cluster may share the same L3 cache or additional levels of caches. In still other embodiments, each processor core may include or have exclusive access to one or more caches, and processor cores of the same cluster may share one or more additional caches. In addition to the shared cache(s), processor cores of the same cluster may share other hardware components such as a memory interface, timing circuitry, and other shared circuitry. Processor cores in each cluster also have access to a system memory 130 via an interconnect 110.
In one embodiment, the system memory 130 stores an OS kernel 150 containing instructions executable by the multicore processor system 100 for managing system resources. The OS kernel 150 includes, controls or manages a number of software modules, such as a driver module 111, a file system 112, a virtual file system 113, a system call interface 114, libraries 115, and user space applications 116. The driver module 111 may include a dynamic voltage frequency scaling (DVFS) driver, which dynamically adjusts the operating voltage and frequency of each processor core to manage power and performance of the processor core. The file system 112 provides data structures for the OS kernel 150 to store code and data, and to keep track of system resources and files. The virtual file system 113 provides an abstract layer on top of the file system 112 to facilitate access to different types of file systems. In an embodiment where the OS kernel 150 is a Linux® kernel, the file system 112 may include sysfs, which stores system information about various kernel subsystems, hardware devices, and associated device drivers. The system call interface 114 handles the communications between the user space and the system components. The libraries 115 include a standard library, which provides constructs, routines and definitions for programming languages such as the C programming language and/or other programming languages. The user space applications 116, when executed by any of the processor cores, may call directly or indirectly the libraries 115, the system call interface 114 and the virtual file system 113 to access system utilities and system information.
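As a non-limiting illustration of how a user space application might obtain system information through a file system such as sysfs, the following minimal sketch parses a list of processor core indices. It assumes a Linux kernel that exposes the files `online` and `offline` under `/sys/devices/system/cpu` in the range-list format (for example, `0-3,5`); the function names `parse_cpu_list` and `read_cpu_list` are hypothetical and are not part of the described embodiments.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Parse a range-list string such as "0-3,5" into a set of core indices."""
    cores: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means the list contains no cores
        if "-" in part:
            low, high = part.split("-")
            cores.update(range(int(low), int(high) + 1))
        else:
            cores.add(int(part))
    return cores


def read_cpu_list(name: str,
                  root: Path = Path("/sys/devices/system/cpu")) -> set[int]:
    """Read the "online" or "offline" list; return an empty set if unreadable.

    Assumption: the path and file names follow the Linux sysfs layout.
    """
    try:
        return parse_cpu_list((root / name).read_text())
    except OSError:
        return set()
```

Under these assumptions, `read_cpu_list("offline")` would give the indices of processor cores that the kernel currently treats as offline, which is one possible input to the mapping described below.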
In one embodiment, the software modules 111-116 may be stored in the system memory 130 or other non-transitory computer readable medium accessible by the multicore processor system 100. The software modules 111-116 may be executed by any of the processor cores in the multicore processor system 100. It is understood that the OS kernel 150 may include, control or manage additional or alternative software modules that are not shown in
Alternatively, in the embodiment of
Thus, when the DVFS driver 211 commands an offline processor core (e.g., P0) to adjust its operating parameters such as the frequency or voltage, the mapped processor core (P1 in
As shown in
In another embodiment of
When the DVFS driver 211 commands LP0 to change its operating frequency, the mapped processor core (P1 in
Although P0 is used as an example of an offline processor core, it should be understood that the embodiments described hereinafter are applicable to any of the processor cores in a multicore processor system.
To manage the mapping between the processor cores, the alias manager 212 maps files in a file system 412 (e.g., sysfs) between the corresponding processor cores. The file system 412 includes a collection of data structures, which further include a device directory 450. Under the device directory 450, each processor core (including the emulated processor core Pn) is associated with a device file; e.g., “device P0” for processor core P0, “device P1” for processor core P1, etc. Each of these device files records the operating parameters of the corresponding processor core, including but not limited to the operating frequency and the operating voltage of the corresponding processor core. When P0 goes offline, the pointer pointing to “device P0” may be aliased or linked to another device file such as “device Pi”, where i = 1, ..., n. In one embodiment, a symbolic link may be created to link “device P0” to “device Pi.” The alias or the symbolic link allows a query to “device P0” to be re-directed to “device Pi.” Thus, when an adjustment is made to the operating parameters of an offline processor core (e.g., P0), the device file associated with the mapped processor core (Pi) records the adjustment until P0 goes back online. When P0 is back online, the alias or the symbolic link is removed.
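The symbolic-link variant described above can be sketched as follows. This is a minimal illustration using a temporary directory in place of the device directory 450, not an implementation of the alias manager 212. The file naming (`device_P0`, `device_P1`), the plain-text file contents, and the choice to set the original file aside under a `.saved` suffix while the link exists are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def device_path(device_dir: Path, core: str) -> Path:
    # Hypothetical naming: one plain-text file per core, e.g. "device_P0".
    return device_dir / f"device_{core}"


def alias_offline_core(device_dir: Path, offline: str, mapped: str) -> None:
    """Link the offline core's device file to the mapped core's device file."""
    link = device_path(device_dir, offline)
    # Assumption: the original file is set aside so it can be restored later.
    link.rename(link.with_name(link.name + ".saved"))
    # Relative symbolic link, resolved inside device_dir.
    link.symlink_to(device_path(device_dir, mapped).name)


def remove_alias(device_dir: Path, core: str) -> None:
    """Remove the symbolic link when the core is back online."""
    link = device_path(device_dir, core)
    link.unlink()  # removes only the link, not the mapped core's file
    link.with_name(link.name + ".saved").rename(link)


def demo() -> tuple[str, str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        device_path(d, "P0").write_text("freq_khz=1000000")
        device_path(d, "P1").write_text("freq_khz=1800000")

        alias_offline_core(d, "P0", "P1")
        # A query to "device P0" is re-directed to "device P1".
        query_while_offline = device_path(d, "P0").read_text()
        # An adjustment addressed to P0 is recorded in P1's device file.
        device_path(d, "P0").write_text("freq_khz=1200000")
        recorded_in_p1 = device_path(d, "P1").read_text()

        remove_alias(d, "P0")
        query_after_online = device_path(d, "P0").read_text()
        return query_while_offline, recorded_in_p1, query_after_online
```

In this sketch, reads and writes addressed to the offline core's file follow the link to the mapped core's file, so the requestor needs no knowledge of the mapping.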
In one embodiment, the method 700 begins when the system detects that an offline processor core is present among the plurality of processor cores (step 710). The system maps the offline processor core to a mapped processor core, which is selected from an emulated processor core and the one or more online processor cores (step 720). The emulated processor core is a software construct containing an emulated state of the offline processor core. When the system receives a system call that is sent from a requestor to the offline processor core to request system information from the offline processor core, the system re-directs that system call to the mapped processor core (step 730). The system then returns the system information from the mapped processor core to the requestor in response to the system call (step 740).
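Steps 710 through 740 can be sketched in user-level code as follows. This is a simplified model under stated assumptions, not the described implementation: the class names `Core` and `CoreMapper`, the use of the operating frequency as the only item of system information, and the policy of picking the first online processor core are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

EMULATED = "Pn"  # label used here for the emulated processor core


@dataclass
class Core:
    name: str
    online: bool
    freq_khz: int


@dataclass
class CoreMapper:
    cores: dict[str, Core]
    mapping: dict[str, str] = field(default_factory=dict)        # offline -> mapped
    emulated_freq: dict[str, int] = field(default_factory=dict)  # emulated state

    def detect_offline(self) -> list[str]:
        """Step 710: detect offline processor cores."""
        return [c.name for c in self.cores.values() if not c.online]

    def map_offline(self, name: str, use_emulated: bool = False) -> str:
        """Step 720: map an offline core to an online or emulated core."""
        online = [c.name for c in self.cores.values() if c.online]
        if use_emulated or not online:
            # Assumption: the emulated state starts from the last known value.
            self.emulated_freq[name] = self.cores[name].freq_khz
            self.mapping[name] = EMULATED
        else:
            self.mapping[name] = online[0]  # simplest possible policy
        return self.mapping[name]

    def query_frequency(self, name: str) -> int:
        """Steps 730 and 740: re-direct the request and return the result."""
        core = self.cores[name]
        if core.online:
            return core.freq_khz
        target = self.mapping[name]
        if target == EMULATED:
            return self.emulated_freq[name]
        return self.cores[target].freq_khz
```

For example, with P0 offline and P1 online at 1,800,000 kHz, calling `map_offline("P0")` followed by `query_frequency("P0")` returns P1's frequency, while mapping with `use_emulated=True` returns the emulated value held for P0.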
In one embodiment, the online processor core that is selected as the mapped processor core is a processor core that has power consumption characteristics and/or computation performance closest to the offline processor core. Alternatively, the mapped processor core may be a processor core having any power consumption characteristics and computation performance. In one embodiment, the system call may include an inquiry about an operating frequency of the offline processor core, and the mapped processor core returns its operating frequency (if the mapped processor core is an online processor core) or an emulated operating frequency (if the mapped processor core is an emulated processor core). In another embodiment, the system call may include a command to adjust the operating frequency of the offline processor core, and the mapped processor core responds by adjusting its operating frequency (if the mapped processor core is an online processor core) or an emulated operating frequency (if the mapped processor core is an emulated processor core).
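One way to select the online processor core whose characteristics are closest to those of the offline processor core is sketched below. The description does not specify a metric, so the two numeric fields (`power_mw`, `perf_score`) and the sum of relative differences used as the distance are assumptions chosen only to make the idea concrete; both metrics of the offline core are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass
class CoreProfile:
    name: str
    power_mw: float    # hypothetical power consumption characteristic
    perf_score: float  # hypothetical computation performance measure
    freq_khz: int


def closest_online_core(offline: CoreProfile,
                        online: list[CoreProfile]) -> CoreProfile:
    """Pick the online core with the smallest relative difference.

    Assumption: the offline core's power_mw and perf_score are positive.
    """
    def distance(candidate: CoreProfile) -> float:
        power_gap = abs(candidate.power_mw - offline.power_mw) / offline.power_mw
        perf_gap = abs(candidate.perf_score - offline.perf_score) / offline.perf_score
        return power_gap + perf_gap

    return min(online, key=distance)


def redirect_set_frequency(mapped: CoreProfile, freq_khz: int) -> int:
    """Apply a frequency adjustment addressed to the offline core to the mapped core."""
    mapped.freq_khz = freq_khz
    return mapped.freq_khz
```

With an offline core profiled at 100 mW and a score of 10, a candidate at 110 mW and a score of 11 would be preferred over one at 400 mW and a score of 40 under this metric. Other metrics, or no metric at all, are equally consistent with the alternative in which the mapped processor core may have any characteristics.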
The operations of the flow diagrams of
Various functional components or blocks have been described herein. As will be appreciated by persons skilled in the art, the functional blocks will preferably be implemented through circuits (either dedicated circuits, or general purpose circuits, which operate under the control of one or more processors and coded instructions), which will typically comprise transistors that are configured in such a way as to control the operation of the circuitry in accordance with the functions and operations described herein. The specific structure or interconnections of the transistors may be determined by a compiler, such as a register transfer language (RTL) compiler. RTL compilers operate upon scripts that closely resemble assembly language code, to compile the script into a form that is used for the layout or fabrication of the ultimate circuitry. RTL is well known for its role and use in the facilitation of the design process of electronic and digital systems.
While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described, and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.
This application claims the benefit of U.S. Provisional Application No. 62/262,417 filed on Dec. 3, 2015.
Number | Date | Country
---|---|---
62262417 | Dec 2015 | US