Modern general-purpose and graphics processors may include one or more processing cores, and these processing cores may run a relatively large number of threads. Analyzing the performance of a multi-core processor may therefore be a complex undertaking, given the number of tasks and the number of different threads that may be running.
Analyzing the performance of certain software may involve capturing a buffer of what each thread does in the process and using analysis tools to generate reports and visualizations of what occurred in the application. Challenges arise in comparing data collected across different application sessions, a process called “differencing.”
More specifically, in a serially executed application, differencing is conventionally relatively straightforward because the relative sequence of function calls or tasks is usually deterministic. As a result, a conventional differencing algorithm may scan the list of records in the file to establish a correspondence between records relatively quickly. However, in parallel-executed applications, the assignment of tasks to threads is rarely deterministic, and the time at which a task executes on a given thread is equally nondeterministic. As a result, even for two runs of an application that were passed exactly the same input, it is relatively hard to determine one-to-one correspondences between individual tasks.
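As a non-limiting illustration of this difficulty, the following minimal sketch pairs trace records by position, which is roughly what a conventional differencing scan may do. The record layout, the task names and the cycle counts are hypothetical and are assumed only for purposes of the example.

```python
# Hypothetical trace records: (task_name, cycles). All values are invented.
run_a = [("load", 100), ("parse", 250), ("render", 400)]
serial_run_b = [("load", 110), ("parse", 240), ("render", 390)]

# In a parallel run, the same tasks may be recorded in a different order.
parallel_run_b = [("parse", 240), ("render", 390), ("load", 110)]


def positional_diff(first, second):
    """Pair records by position and report (name_a, name_b, cycle delta)."""
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(first, second)]


def mismatched(pairs):
    """Return the pairs whose two records name different tasks."""
    return [p for p in pairs if p[0] != p[1]]


serial_pairs = positional_diff(run_a, serial_run_b)
parallel_pairs = positional_diff(run_a, parallel_run_b)
```

In this sketch, every positional pair of the serial runs names the same task, whereas every positional pair involving the reordered run names two different tasks, so the reported deltas for that run would not be meaningful.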
Referring to
It is noted that the multi-core graphics processor 20 is merely an example of a multi-core processor, as the multi-core processor described herein may be a multi-core processor other than a graphics processor, such as a single instruction multiple data (SIMD) multi-core processor or a main system processor 40 of the system 10, as non-limiting examples.
In the context of this application, a task is any common unit of work for scheduling and execution. The task may be any portion of code with a beginning and an end, and the duration of the task may be defined as a number of cycles to execute the task. In general, the task is one type of object, which is analyzed by the system 10 in a profile that is generated for purposes of comparing the execution of the application at different times.
For example, the execution time or cache misses associated with a task in a first application session may be compared with the execution time or cache misses of the corresponding task in another application session. The system 10 may display the results of the differencing so that a programmer may, for example, use the results for purposes of analyzing execution of the particular application at different times or in different execution environments, assess software changes to a particular task, etc.
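As a non-limiting illustration, the comparison of two corresponding tasks may be sketched as follows. The `TaskProfile` record and the `diff_task` helper are hypothetical names that are assumed for the example, and the sketch presumes that the correspondence between the two tasks has already been established.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical per-task measurements captured in one application session."""
    name: str
    cycles: int
    cache_misses: int


def diff_task(first: TaskProfile, second: TaskProfile) -> dict:
    """Report how the second session differs from the first for one task."""
    return {
        "name": first.name,
        "cycles_delta": second.cycles - first.cycles,
        "cache_misses_delta": second.cache_misses - first.cache_misses,
    }


# Invented measurements for the same task in two sessions.
before = TaskProfile("physics_step", cycles=1000, cache_misses=40)
after = TaskProfile("physics_step", cycles=1200, cache_misses=35)
result = diff_task(before, after)
```

A positive delta indicates that the metric increased in the second session, and a negative delta indicates that it decreased.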
In accordance with some embodiments of the invention, a technique 120, which is depicted in
For the examples described below, it is assumed that the objects for the differencing analysis are tasks. However, techniques and systems that are disclosed herein may likewise be applied to other objects, in accordance with other embodiments of the invention.
Referring to
For example, a particular matching rule may dictate that task matches are evaluated based on task identifications (IDs). Although such a rule may be helpful in some instances, in other instances, task IDs may change between application sessions and thus may not be helpful for matching purposes. For example, in a particular application session, the ID for a particular task may be “object 46.” However, in a subsequent application session, the same task may be assigned an ID of “object 48.” As such, a matching rule, other than one that is based on a task ID match, may be used, in accordance with other embodiments of the invention.
As another example, the matching rule may specify that task matches are evaluated based on a particular task parameter, instead of on task ID. In this regard, a parameter is any kind of normal data type or structure that has a temporal or functional relationship to the task. As non-limiting examples, the parameter may be a buffer, a name-value pair or a string. In general, the parameter has a particular association to the task, and this association does not change between application sessions.
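The difference between the two kinds of matching rules may be sketched as follows, as a non-limiting example. The dictionary-based task records, the task name “skin_mesh,” the parameter value “hero_mesh” and the `match_by` helper are all hypothetical; only the IDs “object 46” and “object 48” are taken from the example above.

```python
# The same task observed in two sessions: its ID changes, its parameter does not.
session_1 = [{"name": "skin_mesh", "id": "object 46", "param": "hero_mesh"}]
session_2 = [{"name": "skin_mesh", "id": "object 48", "param": "hero_mesh"}]


def match_by(key, tasks_a, tasks_b):
    """Pair each task in tasks_a with the task in tasks_b that has the same value for key."""
    index = {task[key]: task for task in tasks_b}
    return [(a, index[a[key]]) for a in tasks_a if a[key] in index]


id_matches = match_by("id", session_1, session_2)
param_matches = match_by("param", session_1, session_2)
```

Under these assumptions, the ID-based rule finds no match because the ID differs between the sessions, whereas the parameter-based rule pairs the two records.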
As depicted in
In this regard, in accordance with some embodiments of the invention, each unique task has a unique name and an associated matching rule that is described by the corresponding entry 100 in the matching rules dictionary 46. When a matching analysis is performed on a given task, the name of the given task is searched in the matching rules dictionary 46; and when a corresponding entry 100 is found, the IDs or parameters (depending on the matching rule) indicated by the entry 100 are used to identify a match. Thus, if the matching rule specifies that the match is to be evaluated based on task ID, then the IDs indicated by the entry 100 are examined to identify a match; and if the matching rule sets forth that the matching is to be based on a particular parameter, then the parameters indicated by the entry 100 are examined to identify a potential match.
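One possible in-memory form of such a dictionary, and of the lookup described above, is sketched below as a non-limiting example. The layout of each entry (a rule field and a set of values), the field names and the sample tasks are assumptions made for illustration and are not a definitive representation of the entry 100.

```python
# Hypothetical dictionary: task name -> entry holding the rule and the recorded values.
matching_rules = {
    "skin_mesh": {"rule": "param", "values": {"hero_mesh"}},
    "update_ai": {"rule": "id", "values": {"object 7"}},
}


def is_match(task, rules):
    """Look up the task's name, then compare the field that the entry's rule selects."""
    entry = rules.get(task["name"])
    if entry is None:
        return False  # no entry for this task name, so nothing to match against
    return task[entry["rule"]] in entry["values"]


# Candidate tasks from another session (invented values).
candidate_a = {"name": "skin_mesh", "id": "object 48", "param": "hero_mesh"}
candidate_b = {"name": "update_ai", "id": "object 9", "param": "squad_1"}
candidate_c = {"name": "unknown_task", "id": "object 1", "param": "none"}
```

Here the first candidate matches through its parameter even though its ID is new, the second fails because its rule is ID-based and the ID differs, and the third has no entry at all.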
Referring back to
In accordance with some embodiments of the invention, the system 10 may include a display 30 that may be coupled to the graphics processor 20 for purposes of displaying a particular graphical user interface (GUI) 34. In general, the GUI 34 may be used to display particular objects or tasks to the user so that a particular object or task may be selected for analysis (via input/output (I/O) devices, such as a mouse 64 or keyboard 62 that are coupled to an I/O interface 60, for example). Furthermore, the user may select a group of tasks that may be associated with a higher level software object, for example. The result is that the user may select the group of tasks that the differencing analysis uses. Thus, the tasks that are selected are tracked between different application sessions for purposes of performing the differencing analysis.
The GUI 34 may also be used, in accordance with some embodiments of the invention, for purposes of displaying the results of the differencing analysis. For example, in accordance with some embodiments of the invention, the GUI 34 may display a histogram illustrating relative execution times, cache misses, etc., between different application sessions for the selected group of tasks.
It is noted that
Referring to
The tasks that are to be analyzed may be selected, for example, through user interaction with the GUI 34 (
The technique 150 includes, for each task, determining whether a corresponding matching rule (and thus, a corresponding entry 100) exists in the matching rules dictionary 46, pursuant to diamond 158. If not, then a corresponding entry 100 (see
Thus, for a task whose name corresponds to an existing entry 100, the technique 150 updates the entry 100 to further reflect the ID or parameter associated with the task; and for a task whose name does not correspond to an existing entry 100, the technique 150 creates a new entry 100 with the associated ID or parameter.
The technique 150 concludes by determining (block 168) whether another task is to be processed, and if so, control returns to diamond 158.
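The create-or-update loop described above may be sketched as follows, as a non-limiting example. The entry layout, the `update_rules` name and the `default_rule` argument are assumptions; in particular, how the matching rule for a newly created entry is chosen is not specified here, so the sketch simply takes it as an argument.

```python
def update_rules(rules, selected_tasks, default_rule="param"):
    """For each selected task, create an entry if none exists, then record its ID or parameter."""
    for task in selected_tasks:
        entry = rules.get(task["name"])
        if entry is None:
            # No entry for this task name yet: create one (rule choice is an assumption).
            entry = {"rule": default_rule, "values": set()}
            rules[task["name"]] = entry
        # Record the ID or parameter, depending on the entry's matching rule.
        entry["values"].add(task[entry["rule"]])
    return rules


# Invented starting dictionary and selected tasks.
rules = {"update_ai": {"rule": "id", "values": {"object 7"}}}
selected = [
    {"name": "update_ai", "id": "object 9", "param": "squad_1"},
    {"name": "skin_mesh", "id": "object 46", "param": "hero_mesh"},
]
update_rules(rules, selected)
```

After the call in this sketch, the existing “update_ai” entry also records the ID “object 9,” and a new “skin_mesh” entry records the parameter “hero_mesh.”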
After the matching rules dictionary 46 has been updated, the differencing program may then begin a search of the tasks in the other application session, in a technique 180 that is depicted in
Other embodiments are contemplated and are within the scope of the appended claims. For example, in accordance with other embodiments of the invention, the matching rule may not be based solely on a single parameter or ID, but instead, the match may be based on a combinatorial rule. For example, a particular matching rule may set forth an ID and then further refine it with a parameter. That is, identifiers are unique within the hierarchy (based on some lineage) but not unilaterally unique. Other variations are contemplated and are within the scope of the appended claims.
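A combinatorial rule of this kind may be sketched, as a non-limiting example, by matching on a composite key. The field names, the task name “ik_solve,” the parameter values and the helper names are hypothetical and are assumed only for purposes of illustration.

```python
def composite_key(task, fields=("id", "param")):
    """Build a key from the ID and then refine it with the parameter."""
    return tuple(task[field] for field in fields)


def match_combinatorial(tasks_a, tasks_b, fields=("id", "param")):
    """Pair tasks whose composite keys are equal."""
    index = {composite_key(task, fields): task for task in tasks_b}
    return [
        (a, index[composite_key(a, fields)])
        for a in tasks_a
        if composite_key(a, fields) in index
    ]


# Two tasks share the ID "object 1" but belong to different parents (invented data).
session_1 = [
    {"name": "ik_solve", "id": "object 1", "param": "left_arm", "cycles": 300},
    {"name": "ik_solve", "id": "object 1", "param": "right_arm", "cycles": 500},
]
session_2 = [
    {"name": "ik_solve", "id": "object 1", "param": "right_arm", "cycles": 520},
    {"name": "ik_solve", "id": "object 1", "param": "left_arm", "cycles": 290},
]

pairs = match_combinatorial(session_1, session_2)
id_only_keys = {composite_key(task, ("id",)) for task in session_2}
```

In this sketch, the ID alone yields a single key for both tasks and so cannot tell them apart, whereas the composite key pairs each task with its counterpart regardless of the order in which the tasks were recorded.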
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of this disclosure, will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.
Number | Date | Country |
---|---|---|
20110154310 A1 | Jun 2011 | US |