One challenge in improving the performance of an application process is minimizing preemptions of the application process while it is running. For instance, when a thread runs to the end of its quantum and does not yield, the thread is preempted so that another thread can be scheduled to run. Thus, when a particular application process is preempted by a thread of another application process, the amount of time needed to complete the particular application process can increase, which can introduce or increase latency.
In some examples, an application process can be hard-affinitized to one or more cores of a central processing unit (CPU) so that the process's threads execute only on a designated set of cores. For instance, it may be desirable to run an application process that requires a large amount of processing power on a partitioned set of dedicated cores, where the set of cores is focused on handling the tasks of the application process. Accordingly, being able to partition and dedicate one or more cores of a host computing device to executing a particular application process is desirable for increasing application performance.
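By way of a non-limiting illustration, the following sketch shows one way a process could be hard-affinitized to a designated set of cores from user mode. It assumes the third-party psutil package is available; the core indices, the process identifier, and the function name are illustrative assumptions rather than part of the examples described herein.

```python
# Illustrative sketch: hard-affinitize a process to a dedicated set of cores.
# Assumes the cross-platform psutil package; the core indices are hypothetical.
import psutil

DEDICATED_CORES = [4, 5, 6, 7]  # hypothetical partition of four cores


def affinitize_process(pid: int, cores: list[int]) -> None:
    """Request that the OS schedule the process only on the given cores."""
    psutil.Process(pid).cpu_affinity(cores)


if __name__ == "__main__":
    affinitize_process(psutil.Process().pid, DEDICATED_CORES)
    print(psutil.Process().cpu_affinity())  # e.g., [4, 5, 6, 7]
```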
In some examples, a core in a set of cores that is dedicated to a particular application process is preempted to run a thread of another application process. The preemption can be caused, for example, by affinitization configurations made prior to partitioning of the set of cores. For example, another application process that is affinitized, bound, or otherwise mapped to the dedicated cores can preempt the particular application process and cause or increase latency.
It is with respect to these and other considerations that examples have been made. In addition, although relatively specific problems have been discussed, it should be understood that the examples should not be limited to solving the specific problems identified in the background.
Examples described in this disclosure relate to systems and methods for providing cross-CPU partition preemption identification and removal. Examples of the present disclosure address various aspects of the foregoing challenge by implementing diagnosis of preemption events on a CPU having multiple cores. A preemption diagnostics system and method identify and remove cross-partition preemption events from a CPU partition such that the cores included in the CPU partition are dedicated to executing threads of a particular application process. As a result, bounded latency associated with running the application process can be achieved.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The present disclosure is illustrated by way of example by the accompanying figures, in which like references indicate similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.
Examples described in this disclosure relate to systems and methods for identifying and removing latency-inducing preemptions from a central processing unit (CPU) partition. A computing device typically includes a main CPU with multiple cores that execute instructions independently, cooperatively, or in other suitable manners. In some examples, a set of one or more cores is partitioned and dedicated to a particular application, where the set of cores in the partition is intended to be used exclusively for running processes of the application. In some examples, some “noise” can be introduced in a CPU partition, where preemptions associated with cross-CPU partition processes interrupt execution of the particular application. A preemption diagnostics system and method identify, reduce, remove, and/or otherwise limit the additional latency caused by cross-partition events, such that the particular application has dedicated use of the cores in the partition. As a result, bounded latency associated with running the particular application's processes can be achieved.
As shown, the computing device 100 further includes a memory 104, a user interface 108, and a network interface 106 operatively coupled to one another. The memory 104 includes volatile and/or non-volatile media and/or other types of computer-readable storage media configured to store data received from, as well as instructions for, a preemption diagnostics system 111 (e.g., instructions for performing the method discussed below with reference to
In some examples, the user interface 108 includes a data output device (e.g., visual display, audio speakers), and one or more data input devices. The data input devices may include combinations of one or more of keypads, keyboards, mouse devices, touch screens, microphones, speech recognition packages, gesture recognition packages, and any other suitable devices or other electronic/software selection methods.
In some examples, the network interface 106 includes wired and/or wireless communication interface components that enable the computing device 100 to transmit and receive data via a network. In various embodiments, the wireless interface component operates to communicate via cellular, WI-FI, BLUETOOTH, satellite transmissions, and/or so forth. The wired interface component may include a direct I/O interface, such as an Ethernet interface, a serial interface, a Universal Serial Bus (USB) interface, and/or so forth. For example, the computing device 100 comprises network capabilities for exchanging data with other electronic devices (e.g., laptop computers, servers, etc.) via one or more networks, such as the Internet.
According to examples, the memory 104 of the computing device 100 stores an operating system 112, the preemption diagnostics system 111, and applications 118a-n (collectively, applications 118), which are executed by the multi-core processor 102. The operating system 112 includes components that enable the computing device 100 to receive data via various inputs (e.g., user controls, network interfaces, and/or memory devices), and process the data using the multi-core processor 102 to generate output. The operating system 112 may further include one or more components that provide the output to the user interface 108, the network interface 106, another portion of memory 104, etc. (e.g., display an image on an electronic display, store data in memory, transmit data to another electronic device).
In some examples, the operating system 112 includes other components that perform various other functions generally associated with an operating system, such as processing data for the applications 118. Various components, for example, work cooperatively to manage the threads that process the data for the applications 118. The threads, for example, perform tasks of the applications 118. A thread, as used herein, refers to a single sequential flow of control within a program or application. In some examples, the various components include a dispatcher, a scheduler, and a queue. According to some examples, the dispatcher allocates cores 110 of the multi-core processor 102 to execute threads that perform different tasks (e.g., on a round-robin basis, a fair-use basis, a random-order basis, or on some other basis). In some examples, the scheduler is responsible for managing the execution of threads by distributing computer resources to different threads. For instance, there can be hundreds of processes running at the same time, more than the multi-core processor 102 can handle simultaneously. Thus, the scheduler manages these processes and assigns them CPU time based on various factors. In some examples, the scheduler decides which process thread will run on which CPU core 110. According to examples, CPU affinity is a configurable setting that can force a process or thread to run on a particular core 110 or cores 110. Processes or threads that are affinitized to specific cores 110 will only run on the specified cores. For instance, an affinity table can indicate that a caller thread has an affinity for a first core 110a when the caller thread is allocated to execute on the first core 110a. In some examples, the queue stores information that indicates the affinities of the specific threads to particular cores 110.
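By way of a non-limiting illustration, the following sketch shows one possible in-memory representation of such an affinity table, mapping thread identifiers to the cores on which they may run. The thread identifiers, core numbers, and helper name are hypothetical.

```python
# Illustrative sketch of an affinity table; thread IDs and core numbers are hypothetical.
affinity_table: dict[int, set[int]] = {
    1201: {0},           # caller thread affinitized to a first core (core 0)
    1202: {4, 5, 6, 7},  # worker thread affinitized to a partitioned core set
}


def may_run_on(thread_id: int, core: int) -> bool:
    """Return True if the scheduler is permitted to place the thread on the core."""
    allowed = affinity_table.get(thread_id)
    return allowed is None or core in allowed  # no entry means no affinity restriction
```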
In some examples, the applications 118 execute in a desktop environment provided by the operating system 112. The applications 118 can include processes that perform real-time tasks and/or non-real-time tasks. In various examples, the applications 118 include productivity applications, gaming applications, entertainment applications, and/or communication applications. In at least one example, the applications 118 include an interface application that enables a system administrator to assign specific cores 110 of the multi-core processor 102 to execute threads that perform particular application process tasks.
In some examples, CPU partitioning is enabled, where a set of CPU cores 110 is isolated for exclusive use by the particular application 118. According to an example, when a set of cores 110 is dedicated to an application process, bounded latency is provided for the associated workload, where the application process tasks are bound to soft time limits. As can be appreciated, performance of an application 118 is improved by ensuring the set of cores 110 is dedicated to executing the particular application's tasks without preemptions (e.g., interruptions) from processes of other applications. As an example, preemptive multitasking enables a currently running task on a set of cores 110 to be interrupted and swapped out in order to start or continue running another task. Thus, CPU time is allocated to multiple application process tasks. Profile setting properties can be configured to assign an application process to use a given set of cores 110, where a violation (e.g., of a contract) occurs if another application process uses the given set of cores 110. As can be appreciated, completion time of an application process can be negatively impacted by preemptions associated with such violations.
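By way of a non-limiting illustration, the following sketch shows one way such a violation could be detected by flagging processes whose affinity masks overlap the dedicated cores. The use of psutil, the helper name find_contract_violations, and the owner_pid parameter are illustrative assumptions.

```python
# Illustrative sketch: flag processes whose affinity overlaps a dedicated partition
# even though they are not the partition's owner. Assumes psutil; names are hypothetical.
import psutil


def find_contract_violations(dedicated_cores: set[int], owner_pid: int) -> list[int]:
    """Return PIDs of other processes allowed to run on the dedicated cores."""
    violators = []
    for proc in psutil.process_iter(["pid"]):
        if proc.info["pid"] == owner_pid:
            continue
        try:
            if dedicated_cores & set(proc.cpu_affinity()):
                violators.append(proc.info["pid"])  # candidate source of preemptions
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue  # process exited or is not inspectable; skip it
    return violators
```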
While partitioning and dedicating one or more cores 110 to run a particular application process is intended to provide exclusive access to the one or more cores 110 for executing tasks corresponding to a particular application 118 and its processes, in some examples, some “noise” can be introduced in the CPU partition, where “noise” refers to preemptions associated with other application processes that interrupt execution of the particular application 118. For instance, in some cases, prior to dedicating the set of cores 110 to the particular application 118, various hardware or CPU interrupts may be distributed across various cores 110 of the multi-core processor 102. Thus, a preemption event associated with another process that was previously mapped or bound to a core 110 of the CPU partition can interrupt processes of the particular application 118 while they execute on the core 110 (e.g., although the core 110 may be dedicated to the particular application 118). Such interruptions can negatively impact the completion time of the particular application 118. As an illustrative example, a CPU partition including a set of cores 110 is dedicated to running one or more processes of an application 118. In some examples, a latency bound is agreed upon (e.g., in a service level agreement (SLA)) for the CPU partition for ensuring acceptable application performance. Thus, it is desirable to identify, reduce, remove, and/or otherwise limit additional latency caused by cross-CPU partition events, such that the application 118 has dedicated use of the set of cores 110. As a result, the SLA for a bounded latency can be met.
Accordingly, a preemption diagnostics system 111 is provided to automatically identify and isolate violating preemptions from a set of cores 110 in a dedicated CPU partition. In some examples, the multi-core processor 102 communicates with the memory 104 to execute suitable instructions to provide the preemption diagnostics system 111. As shown in
The preemption diagnostics system 111 is configured to evaluate information logged in association with a preemption event (i.e., an event where a task preempts another task). The preemption event log, in some examples, is created when a preemption task (e.g., a task preempting another task, herein referred to as a preemption) is assigned to a queue. According to an example, the information logged includes thread context information that identifies the application process that owns the thread corresponding to the preemption. Examples of preemption events include a scheduler event, where an application process is interrupted to run a task of a higher priority; a deferred procedure call (DPC), where the process is interrupted to run a required, lower-priority task; an interrupt service routine (ISR), where the process is interrupted by an ISR that is running at a higher interrupt request level; or other suitable preemption events. In some examples, the preemption diagnostics system 111 is configured to evaluate information logged in association with a preemption event when an application process is interrupted by preemptions for a duration that meets or exceeds a time threshold (e.g., 100 microseconds).
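By way of a non-limiting illustration, the following sketch shows one way logged preemption events could be screened against the time threshold to surface cross-partition sources. The record layout (owner_pid, kind, duration_us) and the helper name violating_sources are illustrative assumptions and do not correspond to any particular operating system trace format.

```python
# Illustrative sketch of evaluating a preemption event log; the record fields and
# the 100-microsecond threshold mirror the description above but are otherwise assumed.
from dataclasses import dataclass

PREEMPTION_THRESHOLD_US = 100  # e.g., 100 microseconds


@dataclass
class PreemptionEvent:
    owner_pid: int      # process that owns the preempting thread (thread context)
    kind: str           # e.g., "scheduler", "dpc", "isr"
    duration_us: float  # how long the partitioned process was interrupted


def violating_sources(events: list[PreemptionEvent], partition_owner_pid: int) -> set[int]:
    """Return owners of preemptions that meet the threshold and originate cross-partition."""
    return {
        event.owner_pid
        for event in events
        if event.duration_us >= PREEMPTION_THRESHOLD_US
        and event.owner_pid != partition_owner_pid
    }
```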
When a preemption is determined to be a violating preemption, and thus the source of the preemption is not associated with the application 118 to which a CPU partition is dedicated, the preemption diagnostics system 111 further moves the source of the violating preemption out of the dedicated CPU partition. In some examples, the preemption diagnostics system 111 modifies one or more system configuration settings to move and bind the source of the violating preemption to another core 110 (e.g., outside the dedicated CPU partition). In some examples, the preemption diagnostics system 111 refactors code corresponding to the source of the violating preemption so that it no longer generates threads that affinitize to the dedicated partition and preempt the dedicated application 118. Thereby, CPU partition SLAs can be honored. Accordingly, latency caused by cross-CPU partition violating preemptions is reduced, removed, and/or otherwise limited. Although
As an example, and as indicated by shading in
Accordingly, and with reference now to
With reference now to
At operation 404, an event log including information associated with the preemption event 204 is generated. For instance, the information recorded in the event log includes context information that identifies the application process 206 that owns the thread corresponding to the preemption 204. Operations 402 and 404 may loop during execution of the first application process 206a.
At operation 406, the event log is analyzed. For example, the preemption diagnostics system 111 evaluates the recorded information for determining a source of the preemption(s) 204. In some examples, an indication of a preemption 204 is received by the preemption diagnostics system 111 and the associated event log data is analyzed when the first application process 206a is interrupted for longer than a threshold duration. In other examples, an indication of a preemption 204 is received in association with receiving a preemption event log.
At decision operation 408, a determination is made as to whether the determined source is the first application process 206a or another application process 206b-n. When a determination is made that the source of a logged preemption 204 is an application process 206b-n other than the first application process 206a, at operation 410, future preemptions associated with the other application process 206b-n are reassigned to another CPU partition 202b-n. For example, system configuration settings are modified, code is refactored, or other configurations are made by the preemption diagnostics system 111 to prevent the other application process 206b-206n from affinitizing to the cores 110 included in the dedicated CPU partition 202a. Accordingly, the first application process 206a is provided with full use of the resources included in the dedicated CPU partition 202a without interruptions from cross-partition events.
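By way of a non-limiting illustration, the following sketch ties operations 406 through 410 together: cross-partition sources identified from the event log are rebound to cores outside the dedicated CPU partition. It assumes psutil and reuses the hypothetical violating_sources helper sketched above; rebinding affinity is only one of the corrective actions contemplated (code refactoring is not shown).

```python
# Illustrative sketch of operations 406-410; assumes psutil and the hypothetical
# violating_sources() helper sketched above. Not a definitive implementation.
import psutil


def evict_from_partition(violator_pid: int, dedicated_cores: set[int]) -> None:
    """Operation 410: bind the violating source to cores outside the dedicated partition."""
    outside = sorted(set(range(psutil.cpu_count(logical=True))) - dedicated_cores)
    if not outside:
        raise RuntimeError("no cores available outside the dedicated partition")
    psutil.Process(violator_pid).cpu_affinity(outside)


def handle_preemptions(events, partition_owner_pid: int, dedicated_cores: set[int]) -> None:
    # Operations 406/408: analyze the event log and identify cross-partition sources.
    for violator_pid in violating_sources(events, partition_owner_pid):
        evict_from_partition(violator_pid, dedicated_cores)  # operation 410
```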
The operating system 505 may be suitable for controlling the operation of the computing device 500. Furthermore, aspects of the disclosure may be practiced in conjunction with a graphics library, other operating systems, or any other application program and are not limited to any particular application or system. This basic configuration is illustrated in
As stated above, a number of program modules and data files may be stored in the system memory 504. While executing on the processing unit 502, the program modules 506 may perform processes including one or more of the stages of the method 400 illustrated in
Furthermore, examples of the disclosure may be practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. For example, examples of the disclosure may be practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in
The computing device 500 may also have one or more input device(s) 512, such as a keyboard, a mouse, a pen, a sound input device, a touch input device, a camera, etc. Output device(s) 514, such as a display, speakers, a printer, etc., may also be included. The aforementioned devices are examples and others may be used. The computing device 500 may include one or more communication connections 516 allowing communications with other computing devices 518. Examples of suitable communication connections 516 include RF transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.
The term computer readable media as used herein includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules. The system memory 504, the removable storage device 509, and the non-removable storage device 510 are all computer readable media examples (e.g., memory storage.) Computer readable media include random access memory (RAM), read-only memory (ROM), electrically erasable programmable ROM (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 500. Any such computer readable media may be part of the computing device 500. Computer readable media does not include a carrier wave or other propagated data signal.
Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media.
One or more application programs 650 (e.g., one or more of the components of the preemption diagnostics system 111) is loaded into the memory 662 and run on or in association with the operating system 664. Other examples of the application programs 650 include videoconference or virtual meeting programs, phone dialer programs, e-mail programs, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, and so forth. The system 602 also includes a non-volatile storage area 668 within the memory 662. The non-volatile storage area 668 may be used to store persistent information that should not be lost if the system 602 is powered down. The application programs 650 may use and store information in the non-volatile storage area 668, such as e-mail or other messages used by an e-mail application, and the like. A synchronization application (not shown) also resides on the system 602 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 668 synchronized with corresponding information stored at a remote device or server. As should be appreciated, other applications may be loaded into the memory 662 and run on the mobile computing device 600.
The system 602 has a power supply 670, which may be implemented as one or more batteries. The power supply 670 might further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries.
The system 602 may also include a radio 672 that performs the function of transmitting and receiving radio frequency (RF) communications. The radio 672 facilitates wireless connectivity between the system 602 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 672 are conducted under control of the operating system 664. In other words, communications received by the radio 672 may be disseminated to the application programs 650 via the operating system 664, and vice versa.
The visual indicator 620 (e.g., light emitting diode (LED)) may be used to provide visual notifications and/or an audio interface 674 may be used for producing audible notifications via the audio transducer 625. In the illustrated example, the visual indicator 620 is a light emitting diode (LED) and the audio transducer 625 is a speaker. These devices may be directly coupled to the power supply 670 so that when activated, they remain on for a duration dictated by the notification mechanism even though the processor 660 and other components might shut down for conserving battery power. The LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device. The audio interface 674 is used to provide audible signals to and receive audible signals from the user. For example, in addition to being coupled to the audio transducer 625, the audio interface 674 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation. The system 602 may further include a video interface 676 that enables an operation of a peripheral device port 630 (e.g., an on-board camera) to record still images, video stream, and the like.
A mobile computing device 600 implementing the system 602 may have additional features or functionality. For example, the mobile computing device 600 may also include additional data storage devices (removable and/or non-removable) such as, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
Data/information generated or captured by the mobile computing device 600 and stored via the system 602 may be stored locally on the mobile computing device 600, as described above, or the data may be stored on any number of storage media that may be accessed by the device via the radio 672 or via a wired connection between the mobile computing device 600 and a separate computing device associated with the mobile computing device 600, for example, a server computer in a distributed computing network, such as the Internet. As should be appreciated, such data/information may be accessed via the mobile computing device 600 via the radio 672 or via a distributed computing network. Similarly, such data/information may be readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.
Examples include a computer-implemented method for preventing a cross-central processing unit (CPU) partition preemption, comprising: receiving an indication of a preemption of a first application process running within a CPU partition dedicated to running the first application process; determining, based on a log event associated with the preemption, a source of the preemption is a second application process different from the first application process; determining the preemption is a violating preemption; and taking a corrective action to prevent the source of the violating preemption from running in the dedicated CPU partition.
Examples include a system for preventing a cross-central processing unit (CPU) partition preemption, the system comprising: a processor; and memory storing instructions that, when executed by the processor, cause the system to: receive an indication of a preemption of a first application process running within a CPU partition dedicated to running the first application process; determine, based on a log event associated with the preemption, a source of the preemption is a second application process different from the first application process; determine the preemption is a violating preemption; and take a corrective action to prevent the source of the violating preemption from running in the dedicated CPU partition.
Examples include a computer-readable medium storing instructions that, when executed by a computer, cause the computer to: receive an indication of a preemption of a first application process running within a first central processing unit (CPU) partition dedicated to running the first application process, the first CPU partition including one or more cores; generate a log event associated with the preemption; determine, based on the log event, a source of the preemption is a second application process different from the first application process; and take a corrective action to affinitize the second application process to a core outside the first CPU partition.
It is to be understood that the methods, modules, and components depicted herein are merely examples. Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), System-on-a-Chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. In an abstract, but still definite sense, any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or inter-medial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “coupled,” to each other to achieve the desired functionality. Merely because a component, which may be an apparatus, a structure, a system, or any other implementation of a functionality, is described herein as being coupled to another component does not mean that the components are necessarily separate components. As an example, a component A described as being coupled to another component B may be a sub-component of the component B, the component B may be a sub-component of the component A, or components A and B may be a combined sub-component of another component C.
The functionality associated with some examples described in this disclosure can also include instructions stored in a non-transitory media. The term “non-transitory media” as used herein refers to any media storing data and/or instructions that cause a machine to operate in a specific manner. Illustrative non-transitory media include non-volatile media and/or volatile media. Non-volatile media include, for example, a hard disk, a solid-state drive, a magnetic disk or tape, an optical disk or tape, a flash memory, an EPROM, NVRAM, PRAM, or other such media, or networked versions of such media. Volatile media include, for example, dynamic memory such as DRAM, SRAM, a cache, or other such media. Non-transitory media is distinct from, but can be used in conjunction with transmission media. Transmission media is used for transferring data and/or instruction to or from a machine. Examples of transmission media include coaxial cables, fiber-optic cables, copper wires, and wireless media, such as radio waves.
Furthermore, those skilled in the art will recognize that boundaries between the functionality of the above described operations are merely illustrative. The functionality of multiple operations may be combined into a single operation, and/or the functionality of a single operation may be distributed in additional operations. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.
Although the disclosure provides specific examples, various modifications and changes can be made without departing from the scope of the disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure. Any benefits, advantages, or solutions to problems that are described herein with regard to a specific example are not intended to be construed as a critical, required, or essential feature or element of any or all the claims.
Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.
Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements.