In a high performance computer system, such as a real time control system, precise measurement of execution time of any individual operation or set of operations in a computer program is important for identifying potential areas for improvement. However, measuring performance of a computer system can affect the performance of the computer system. Ideally, any technique to measure execution time in a high performance computer system should maintain and not adversely impact any performance guarantees of the computer system, such as real time performance, while providing microsecond precision and utilizing minimal memory resources.
Such constraints on measuring execution time in a high performance computer system are particularly challenging if the computer system supports concurrent operations by different independent portions of a computer program or by different computer programs. These challenges are exacerbated if use of the computer system is outside the control of the developer of the computer system, such as with a consumer device. In such use, different computer systems have different resources, applications, versions, updates, usage patterns, and so on.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is intended neither to identify key or essential features, nor to limit the scope, of the claimed subject matter.
A computer system supports measuring execution time of concurrent operations by different independent portions of a computer program or by different computer programs. An independent portion of a computer program, herein called a thread, includes thread local storage accessible only to that thread during execution of the thread by its processor. During execution, the thread also has access to a high performance system timer, which drives the timing of the processor, to allow sampling of the system timer with microsecond or better precision with a single instruction. The thread allocates a timing buffer in the thread local storage.
For any sequence of instructions within the thread for which execution time is to be measured, the sequence of instructions has an identifier and includes two commands, herein called a start command and an end command. The start command is an instruction at the beginning of the sequence of instructions to be measured; the end command is an instruction at the end of the sequence of instructions to be measured. The start command samples the system timer to obtain a start time, and stores the identifier and the start time in the timing buffer in the thread local storage. The end command samples the system timer to obtain an end time, and updates the data for the corresponding identifier in the timing buffer, to indicate an elapsed time for execution of the sequence of instructions. The elapsed time can be so indicated, for example, by storing the start time and the end time, or by computing and storing the difference between the start time and the end time. The start command and end command each can be implemented as a single executable instruction.
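The start command and end command described above can be sketched as follows. This is a minimal illustration only, not the implementation the description contemplates: Python is used for brevity, the names `timing_buffer`, `start_command`, `end_command` and `elapsed_ns` are hypothetical, `time.perf_counter_ns` stands in for the system timer, and a Python function call is of course not a single executable instruction.

```python
import threading
import time

# Hypothetical names throughout; the description does not prescribe an API.
_tls = threading.local()  # attributes set on this object are private to each thread


def timing_buffer():
    """Return this thread's timing buffer, allocating it on first use."""
    buf = getattr(_tls, "buffer", None)
    if buf is None:
        buf = _tls.buffer = {}
    return buf


def start_command(identifier):
    """Sample the timer and store the identifier with its start time."""
    timing_buffer()[identifier] = [time.perf_counter_ns(), None]


def end_command(identifier):
    """Sample the timer, then record the end time for the same identifier."""
    end = time.perf_counter_ns()
    timing_buffer()[identifier][1] = end


def elapsed_ns(identifier):
    """Elapsed time derived from the stored start time and end time."""
    start, end = timing_buffer()[identifier]
    return end - start


start_command(1)
total = sum(range(100_000))  # the sequence of instructions being measured
end_command(1)
```

This sketch stores both the start time and the end time and derives the difference on demand, which is one of the two ways of indicating elapsed time mentioned above. Storing the difference directly in `end_command` would be the other.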
With a computer system that can execute multiple concurrent threads, execution time for sequences of instructions in concurrent threads can be measured using these techniques in a lock-less fashion, because each thread accesses its own thread local storage to store timing data. Further, the execution time can be measured with microsecond, or better, precision, because the system timer is sampled just at the beginning and end of execution of the sequence of instructions for which execution time is being measured. Additionally, execution time can be measured with minimal impact on performance, by using single executable instructions to capture start times and end times and by using a relatively small timing buffer in thread local storage.
The data in the timing buffers for multiple threads can be collected and stored by the computer program for later analysis. For example, in response to termination of execution of a thread, or the computer program including the thread, or in response to some other event, the timing buffers allocated by the computer program can be collected and stored by, for example, the computer program or by the operating system.
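One way to picture collection on thread termination is the sketch below. It rests on assumptions not stated in the description: the shared `collected` store, the `collect` function and the `worker` function are hypothetical, and the lock is taken only when a finished buffer is copied out, never while a measurement is being recorded.

```python
import threading
import time

_tls = threading.local()
collected = {}                      # hypothetical store: thread name -> timing records
_collected_lock = threading.Lock()  # used only at collection time, not while measuring


def start_command(identifier):
    if not hasattr(_tls, "buffer"):
        _tls.buffer = {}            # allocate this thread's timing buffer
    _tls.buffer[identifier] = [time.perf_counter_ns(), None]


def end_command(identifier):
    end = time.perf_counter_ns()
    _tls.buffer[identifier][1] = end


def collect():
    """Copy the calling thread's timing buffer into the shared store."""
    records = dict(getattr(_tls, "buffer", {}))
    with _collected_lock:
        collected[threading.current_thread().name] = records


def worker(identifier, n):
    try:
        start_command(identifier)
        sum(range(n))               # the sequence of instructions being measured
        end_command(identifier)
    finally:
        collect()                   # stands in for the thread-termination event


threads = [
    threading.Thread(target=worker, args=(i, 10_000 * i), name=f"thread-{i}")
    for i in (1, 2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads have been joined, `collected` holds one set of records per thread, which the program could then write to storage for later analysis.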
Using such techniques, any computer program also can be written to allow execution time to be measured for any sequence of instructions in a thread of the computer program. In one implementation, source code of the computer program can be annotated with keywords indicating a start point of a sequence of instructions for which execution time is to be measured, and an end point of that sequence of instructions. A compiler or pre-compiler can process such keywords so as to assign identifiers to the corresponding sequences of instructions, and to insert corresponding instructions (implementing the start command and the end command) in the computer program.
In the following description, reference is made to the accompanying drawings which form a part hereof, and in which are shown, by way of illustration, specific example implementations. Other implementations may be made without departing from the scope of the disclosure.
The computer can be any of a variety of general purpose or special purpose computing hardware configurations. Some examples of types of computers that can be used include, but are not limited to, personal computers, game consoles, set top boxes, hand-held or laptop devices (for example, media players, notebook computers, tablet computers, cellular phones including but not limited to “smart” phones, personal data assistants, voice recorders), server computers, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, networked personal computers, minicomputers, mainframe computers, and distributed computing environments that include any of the above types of computers or devices, and the like.
With reference to
The memory 1004 may include volatile computer storage devices (such as dynamic random access memory (DRAM) or other random access memory device), and non-volatile computer storage devices (such as a read-only memory, flash memory, and the like) or some combination of the two. A nonvolatile computer storage device is a computer storage device whose contents are not lost when power is removed. Other computer storage devices, such as dedicated memory or registers, also can be present in the one or more processors. The computer 1000 can include additional computer storage devices (whether removable or non-removable) such as, but not limited to, magnetically-recorded or optically-recorded disks or tape. Such additional computer storage devices are illustrated in
A computer storage device is any device in which data can be stored in and retrieved from addressable physical storage locations by the computer. A computer storage device thus can be a volatile or nonvolatile memory, or a removable or non-removable storage device. Memory 1004, removable storage 1008 and non-removable storage 1010 are all examples of computer storage devices. Some examples of computer storage devices are RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optically or magneto-optically recorded storage device, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Computer storage devices and communication media are mutually exclusive categories of media, and are distinct from the signals propagating over communication media.
Computer 1000 may also include communications connection(s) 1012 that allow the computer to communicate with other devices over a communication medium. Communication media typically transmit computer program instructions, data structures, program modules or other data over a wired or wireless substance by propagating a modulated data signal such as a carrier wave or other transport mechanism over the substance. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal, thereby changing the configuration or state of the receiving device of the signal. By way of example, and not limitation, communication media includes wired media, such as metal or other electrically conductive wire that propagates electrical signals or optical fibers that propagate optical signals, and wireless media, such as any non-wired communication media that allows propagation of signals, such as acoustic, electromagnetic, electrical, optical, infrared, radio frequency and other signals.
Communications connections 1012 are devices that interface with communication media to transmit data over and receive data from the communication media. Examples include a wired network interface, a wireless network interface, radio frequency transceivers (e.g., WiFi 1070, cellular 1074, long term evolution (LTE) or Bluetooth 1072 transceivers), navigation transceivers (e.g., global positioning system (GPS) or Global Navigation Satellite System (GLONASS) transceivers), network interface devices 1076 (e.g., Ethernet), or other devices.
The computer 1000 may have various input device(s) 1014 such as a pointer device, keyboard, touch-based input device, pen, camera, microphone, sensors, such as accelerometers, thermometers, light sensors and the like, and so on. The computer 1000 may have various output device(s) 1016 such as a display, speakers, and so on. Such devices are well known in the art and need not be discussed at length here. Various input and output devices can implement a natural user interface (NUI), which is any interface technology that enables a user to interact with a device in a “natural” manner, free from artificial constraints imposed by input devices such as mice, keyboards, remote controls, and the like.
Examples of NUI methods include those relying on speech recognition, touch and stylus recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, voice and speech, vision, touch, gestures, and machine intelligence, and may include the use of touch sensitive displays, voice and speech recognition, intention and goal understanding, motion gesture detection using depth cameras (such as stereoscopic camera systems, infrared camera systems, and other camera systems and combinations of these), motion gesture detection using accelerometers or gyroscopes, facial recognition, three dimensional displays, head, eye, and gaze tracking, immersive augmented reality and virtual reality systems, all of which provide a more natural interface, as well as technologies for sensing brain activity using electric field sensing electrodes (EEG and related methods).
The various computer storage devices 1008 and 1010, communication connections 1012, output devices 1016 and input devices 1014 can be integrated within a housing with the rest of the computer, or can be connected through various input/output interface devices on the computer, in which case the reference numbers 1008, 1010, 1012, 1014 and 1016 can indicate either the interface for connection to a device or the device itself as the case may be.
A computer generally includes an operating system, which is a computer program that manages access, by applications running on the computer, to the various resources of the computer. There may be multiple applications. The various resources include the memory, storage, input devices and output devices, such as display devices and input devices as shown in
The various modules, tools, or applications, and data structures and flowcharts of
Alternatively, or in addition, the functionality of one or more of the various components described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
Given such a computer as shown in
For simplicity herein, an independent portion of a computer program is herein called a thread. In the examples below, example operation of the system is described in the context of concurrent execution of two threads. In these examples, the two threads can be two different independent portions of a computer program, two different instances of the same independent portion of a computer program, or two independent portions of two different computer programs. Further, in practice, the term thread may be used differently with respect to different operating systems and/or computers. Thus, the term “thread” herein is intended to mean a sequence of programmed instructions that can be managed independently by an operating system and for which thread local storage can be allocated in memory in a manner accessible only to that thread during execution of the thread. Such thread local storage generally can be allocated by an application program through an application programming interface provided by the operating system or through constructs provided by a programming language.
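As an illustration of thread local storage allocated through a programming language construct, the following sketch uses Python's `threading.local`. The function `allocate_and_report` and the `seen` dictionary are hypothetical and exist only to make the isolation observable. Each thread allocates its own buffer under the same attribute name, and neither thread, nor the main thread, sees the other's data.

```python
import threading

_tls = threading.local()        # language-provided thread local storage
barrier = threading.Barrier(2)  # ensures both threads have allocated before reading
seen = {}                       # hypothetical: what each thread observes in its buffer


def allocate_and_report(name, entries):
    _tls.buffer = list(entries)     # allocate a buffer in this thread's local storage
    barrier.wait()                  # the other thread has now allocated its buffer too
    seen[name] = list(_tls.buffer)  # still this thread's own entries


a = threading.Thread(target=allocate_and_report, args=("a", [1, 2]))
b = threading.Thread(target=allocate_and_report, args=("b", [3]))
a.start()
b.start()
a.join()
b.join()

# The main thread never allocated a buffer, so it has none.
main_sees_buffer = hasattr(_tls, "buffer")
```

An operating system interface for thread local storage would serve the same purpose. The language construct is shown here only because it is compact.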
Accordingly, turning to
Turning now to
Turning now to
This example illustrates how a computer program operates when it includes a thread for which execution time for a sequence of instructions is measured. While the illustration includes discussion of a single thread and a single sequence of instructions, it should be understood that the thread can include multiple different sequences of instructions for which execution time can be measured. Such a computer program can include multiple threads that execute concurrently, each of which can include one or more sequences of instructions for which execution time can be measured. It should be understood that multiple computer programs can execute concurrently as well, each of which having one or more threads including one or more sequences of instructions for which execution time is measured.
As shown in
With such capabilities being provided in a computer system, any computer program also can be written to allow execution time to be measured for any sequence of instructions in a thread of the computer program. In one implementation, a developer can insert, into source code, start commands and end commands for any sequence of instructions with an identifier for which execution time is to be measured.
In one implementation, described now in connection with
The sequences of instructions are delimited by one or more tags, e.g., in this example for purposes of illustration only, a “<Measure this>” tag (502) to mark the start of the sequence of instructions and a “</Measure this>” tag (504) to mark the end of the sequence of instructions. In this example for purposes of illustration only, the tags are illustrated in the form of a markup tag such as an XML tag. The choice of form and content of the tag can be arbitrary so long as the tag is not a reserved keyword or symbol in the computer programming language used for the source code and is otherwise unique. Different start and end tags can be used, or a single tag can be used to designate both start and end, with context being used to differentiate a start from an end. Tags can have syntax such that they can include additional data.
Given source code that includes such tags, the source code can be processed, for example by a pre-compiler or compiler, to identify the tags, and thus the sequences of instructions for which execution time is to be measured. Each sequence of instructions so identified can be assigned a unique identifier through such processing. Thus, a developer of the source code can simply mark the sequences of instructions with the tags and not be concerned with assigning unique identifiers to the sequences of instructions. Using a pre-compiler implementation, source code instructions can be inserted in the source code in place of the tags so as to provide the start command and end command for capturing execution time data. Using a compiler implementation, such tags can be converted into executable instructions for the start and end commands.
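The pre-compiler approach can be sketched as a simple text transformation. The sketch below is an assumption-laden illustration: the function `precompile`, the emitted call names `start_command` and `end_command`, the requirement that each tag sit on its own line, and the use of a stack to pair nested tags are all hypothetical choices, not details taken from the description.

```python
START_TAG = "<Measure this>"
END_TAG = "</Measure this>"


def precompile(source):
    """Replace each tag pair with start/end command calls and a fresh identifier."""
    out = []
    open_ids = []  # identifiers of tagged sequences that have not yet been closed
    next_id = 1
    for line in source.splitlines():
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]  # keep the tag's indentation
        if stripped == START_TAG:
            open_ids.append(next_id)
            out.append(f"{indent}start_command({next_id})")
            next_id += 1
        elif stripped == END_TAG:
            if not open_ids:
                raise ValueError("end tag without a matching start tag")
            out.append(f"{indent}end_command({open_ids.pop()})")
        else:
            out.append(line)
    if open_ids:
        raise ValueError("start tag without a matching end tag")
    return "\n".join(out)


annotated = """\
<Measure this>
load()
</Measure this>
<Measure this>
compute()
</Measure this>"""

rewritten = precompile(annotated)
```

Here the developer writes only the tags, and the identifiers 1 and 2 are assigned by the tool in order of appearance. A compiler implementation would instead emit executable instructions directly rather than source text.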
With a computer system that can execute multiple concurrent threads, execution time for sequences of instructions in concurrent threads can be measured using these techniques in a lock-less fashion, because each thread accesses its own thread local storage to store timing data. Further, the execution time can be measured with microsecond, or better, precision, because the system timer is sampled just at the beginning and end of execution of the sequence of instructions for which timing is being measured. Additionally, execution time can be measured with minimal impact on performance, by using single executable instructions to capture start times and end times and by using a relatively small timing buffer in thread local storage. Using such techniques, any computer program also can be written to allow execution time to be measured for any sequence of instructions in a thread of the computer program.
Accordingly, in one aspect, a computer comprises a processing system comprising a processing unit and a memory and having a system timer. The processing system, for a first thread to be executed by the processing system, allocates a first buffer in first thread local storage in the memory. For a second thread to be executed concurrently by the processing system, and different from the first thread, the processing system allocates a second buffer separate from the first buffer and in second thread local storage in the memory. In response to execution of a first start command at a beginning of a first sequence of instructions for the first thread, the processing system stores, in the first buffer, an identifier of the first sequence of instructions and a first start time from the system timer at the time of execution of the first start command. In response to execution of a first end command at an end of the first sequence of instructions for the first thread, the processing system stores, in the first buffer and in association with the identifier of the first sequence of instructions, data indicative of an elapsed time between the first start time stored in the first buffer and a first end time from the system timer at the time of execution of the first end command. In response to execution of a second start command at a beginning of a second sequence of instructions in the second thread, the processing system stores, in the second buffer, an identifier of the second sequence of instructions and a second start time from the system timer at a time of execution of the second start command. In response to execution of a second end command at an end of the second sequence of instructions for the second thread, the processing system stores, in the second buffer and in association with the identifier of the second sequence of instructions, data indicative of an elapsed time between the second start time stored in the second buffer and a second end time from the system timer at the time of execution of the second end command.
In another aspect, a computer-implemented process performed by a computer program executing on a processing system of a computer, the computer comprising a processing system having a system timer and memory accessible by threads executed by the processing system, comprises for a first thread to be executed by the processing system, allocating a first buffer in first thread local storage in the memory. For a second thread to be executed concurrently by the processing system, and different from the first thread, a second buffer is allocated separate from the first buffer and in second thread local storage in the memory. In response to execution of a first start command at a beginning of a first sequence of instructions for the first thread, an identifier of the first sequence of instructions and a first start time from the system timer at the time of execution of the first start command are stored in the first buffer. In response to execution of a first end command at an end of the first sequence of instructions for the first thread, data indicative of an elapsed time between the first start time stored in the first buffer and a first end time from the system timer at the time of execution of the first end command are stored in the first buffer and in association with the identifier of the first sequence of instructions. In response to execution of a second start command at a beginning of a second sequence of instructions in the second thread, an identifier of the second sequence of instructions and a second start time from the system timer at a time of execution of the second start command are stored in the second buffer. In response to execution of a second end command at an end of the second sequence of instructions for the second thread, data indicative of an elapsed time between the second start time stored in the second buffer and a second end time from the system timer at the time of execution of the second end command are stored in the second buffer and in association with the identifier of the second sequence of instructions.
In another aspect, a computer comprises: a means for allocating, for a first thread, a first buffer in first thread local storage in a memory and means for allocating, for a second concurrent thread, a second buffer in second thread local storage in a memory; a means for storing a start time from the system timer in the first buffer in response to execution of a start command at a beginning of the first thread; a means for storing a start time from the system timer in the second buffer in response to execution of a start command at a beginning of the second thread; a means for storing, in the first buffer, data indicative of an elapsed time between the first start time stored in the first buffer and a first end time from the system timer at the time of execution of a first end command; and a means for storing, in the second buffer, data indicative of an elapsed time between the second start time stored in the second buffer and a second end time from the system timer at the time of execution of a second end command.
In another aspect, a computer includes means for processing source code, the source code comprising marked sequences of instructions, to insert a start command at a beginning of a marked sequence of instructions and an end command at an end of a marked sequence of instructions, such that when executable code derived from the source code is executed, execution of the start command causes an identifier of the sequence of instructions and a start time from the system timer at the time of execution of the start command to be stored in a buffer in thread local storage, and execution of the end command causes data indicative of an elapsed time between the start time stored in the buffer and an end time from the system timer at the time of execution of the end command to be stored in the buffer in association with the identifier of the sequence of instructions.
In another aspect, a computer-implemented process processes source code, the source code comprising marked sequences of instructions, to insert a start command at a beginning of a marked sequence of instructions and an end command at an end of a marked sequence of instructions, such that when executable code derived from the source code is executed, execution of the start command causes an identifier of the sequence of instructions and a start time from the system timer at the time of execution of the start command to be stored in a buffer in thread local storage, and execution of the end command causes data indicative of an elapsed time between the start time stored in the buffer and an end time from the system timer at the time of execution of the end command to be stored in the buffer in association with the identifier of the sequence of instructions.
In any of the foregoing aspects, the first thread and second thread can be executed by different processing units. For example, the first thread can be executed by a first processing core of the processing system and the second thread can be executed by a second processing core, different from the first processing core, of the processing system. As another example, the first thread can be executed by a central processing unit and the second thread can be executed by a graphics processing unit.
In any of the foregoing aspects, the first thread and the second thread are different sequences of computer program instructions. For example, the first thread and second thread can be different threads of a same computer program. As another example, the first thread and the second thread can be threads of different computer programs.
In any of the foregoing aspects, the start command samples the system timer and stores the current time with the identifier in the timing buffer in a single executable instruction.
In any of the foregoing aspects, the end command samples the system timer and stores data indicative of an elapsed time in the timing buffer in a single executable instruction.
In another aspect, an article of manufacture includes at least one computer storage device, and computer program instructions stored on the at least one computer storage device. The computer program instructions, when processed by a processing system of a computer, the processing system comprising one or more processing units and memory accessible by threads executed by the processing system, and having a system timer, configure the computer as set forth in any of the foregoing aspects and/or perform a process as set forth in any of the foregoing aspects.
Any of the foregoing aspects may be embodied as a computer system, as any individual component of such a computer system, as a process performed by such a computer system or any individual component of such a computer system, or as an article of manufacture including computer storage in which computer program instructions are stored and which, when processed by one or more computers, configure the one or more computers to provide such a computer system or any individual component of such a computer system.
It should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific implementations described above. The specific implementations described above are disclosed as examples only.