A graphical application is built using many different and distinct visual elements. During development of the graphical application, the application may be analyzed to determine which elements consume a disproportionate amount of processing time. A tool that analyzes the performance of an application is commonly referred to as a performance profiler.
Traditional performance profilers provide a set of call graphs of function names to developers. The call graphs provide the total time spent in each function. Developers then perform a trial and error process to determine how best to modify the application so that it executes more efficiently. Thus, the current technique for profiling applications is not ideal.
An adequate performance profiler for graphical applications has eluded those skilled in the art, until now.
Embodiments of the invention are directed at a visual resource profiler. Generally stated, embodiments of the invention display a visual representation of performance data for a target application. The visual representation includes a visual indicator associated with a visual element of the target application. The visual indicator may graphically illustrate a relative processing cost (i.e., a percentage of the total cost) and/or an absolute processing cost (e.g., CPU time in milliseconds) for the associated visual element with respect to other visual elements in the target application. The visual resource profiler may further break down the processing cost for the visual element into the several software subsystems/services that contribute to the processing cost, such as animation, layout, rendering, and the like.
Many of the attendant advantages of the invention will become more readily appreciated as the same becomes better understood with reference to the following detailed description, when taken in conjunction with the accompanying drawings, briefly described here.
Embodiments of the invention will now be described in detail with reference to these Figures in which like numerals refer to like elements throughout.
Various embodiments are described more fully below with reference to the accompanying drawings, which form a part hereof, and which show specific exemplary implementations for practicing various embodiments. However, other embodiments may be implemented in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete. Embodiments may be practiced as methods, systems or devices. Accordingly, embodiments may take the form of a hardware implementation, an entirely software implementation, or an implementation combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
The logical operations of the various embodiments are implemented (1) as a sequence of computer implemented steps running on a computing system and/or (2) as interconnected machine modules within the computing system. The implementation is a matter of choice dependent on the performance requirements of the computing system implementing the embodiment. Accordingly, the logical operations making up the embodiments described herein are referred to alternatively as operations, steps or modules.
Generally stated, the described embodiments include mechanisms and techniques for displaying performance data for a graphical application in a visual manner. For example, an element in a display may be colored using different shades of a specified color to indicate a relative performance cost for that element in relation to other elements within the graphical application. Based on the visual performance data, developers can then readily modify the graphical application to reduce unnecessary processing costs.
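As a minimal sketch of the shading idea just described, the following Python fragment converts per-element processing times into relative costs and then into a shade of a specified color. The function names, the red base color, and the choice to blend from white are illustrative assumptions only and are not taken from the described embodiments.

```python
def relative_costs(costs_ms):
    """Return each element's share of the total processing time (0.0 to 1.0)."""
    total = sum(costs_ms.values())
    if total == 0:
        return {element: 0.0 for element in costs_ms}
    return {element: cost / total for element, cost in costs_ms.items()}


def shade_for(fraction, base_rgb=(255, 0, 0)):
    """Blend from white toward base_rgb in proportion to fraction.

    Assumption: a fraction of 0.0 maps to white and 1.0 maps to the full
    base color. Other mappings (e.g., hue shifts) would work equally well.
    """
    fraction = min(max(fraction, 0.0), 1.0)
    return tuple(round(255 - (255 - channel) * fraction) for channel in base_rgb)


# Hypothetical element names and times, for illustration only.
shares = relative_costs({"save_button": 30.0, "text_button": 10.0})
shades = {element: shade_for(share) for element, share in shares.items()}
```

An element that accounts for a larger share of the total time receives a more saturated shade, so costly elements stand out from the rest of the display.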
Illustrative Systems
The principles and concepts will first be described with reference to sample systems that may implement certain embodiments of the invention. These sample systems may be implemented using conventional or special purpose computing equipment programmed in accordance with the teachings of these embodiments.
The visual profiler process 104 includes a profiling service 110, a visualization service 108, and a visualizer 112. In overview, profiling service 110, described in detail in conjunction with
During execution of the target application, the profiling service 110, shown in
Visual representation 300 depicts relative percentages of processing costs using various densities of dots. For example, most of the visual elements are illustrated having a dot pattern that is sparsely populated. However, visual element 304 (i.e., the text button) is illustrated with a more populated dot pattern. This graphically indicates that visual element 304 and visual elements 302, 308-312, 316, and 318 have different relative processing costs, but visual elements 302, 308-312, 316, and 318 have similar relative processing costs. Visual element 314 (i.e., a save button) is shaded with an even denser dot pattern, which represents that visual element 314 consumes more time and resources than the other visual elements (e.g., visual elements 302-312, 316, and 318) in the display. For illustrative purposes, the dot pattern was used to indicate the relative processing costs between visual elements. The dot pattern may represent a color scheme having various hues of a color to indicate the relative processing costs. In another embodiment, the dot patterns may represent a depth for a three-dimensional bar graph that indicates the amount of processing consumed by the corresponding visual element. For this embodiment, visual element 314 may have a taller three-dimensional bar to indicate that it consumes more processing than the other visual elements.
In addition, user-interface 400 may represent each of the services as a bar on an element bar graph 410 using the same color-coding. An application bar graph 412 may be provided to illustrate the proportional use of the services for all of the profiled elements in the application. One will note that specific elements and/or software subsystems/services may be excluded from profiling if desired. When this occurs, the processing time allocated to the excluded elements and/or software subsystems/services may be omitted when calculating the percentages shown in the application bar graph 412.
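The percentage calculation with excluded services may be sketched as follows. This is an illustrative Python fragment only; the function name, the service names, and the times are assumptions chosen for the example.

```python
def service_percentages(times_ms, excluded=frozenset()):
    """Return each included service's percentage of the included total.

    Time allocated to excluded services is omitted from both the
    numerator and the denominator.
    """
    included = {s: t for s, t in times_ms.items() if s not in excluded}
    total = sum(included.values())
    if total == 0:
        return {s: 0.0 for s in included}
    return {s: 100.0 * t / total for s, t in included.items()}


# Hypothetical per-service totals across all profiled elements.
times = {"animation": 10.0, "layout": 30.0, "rendering": 60.0}
print(service_percentages(times))
print(service_percentages(times, excluded={"rendering"}))
```

With rendering excluded, the remaining services are rescaled so that their percentages again sum to 100.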
User-interface 400 may also include a mechanism for controlling the profiler. For example, a checkbox 416 may be provided to specify whether the visual representation is displayed as an overlay over the display. An update interval box may be provided to specify how often to update the display. A button 420 may also be provided to turn profiling on or off.
As one can see, by having the performance data shown graphically along with the display being tested, as shown in
Additionally, device 500 may also have other features and functionality. For example, device 500 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in
Computing device 500 includes one or more communication connections 514 that allow computing device 500 to communicate with one or more computers and/or applications 513. Device 500 may also have input device(s) 512 such as keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 511 such as a monitor, speakers, printer, PDA, mobile phone, and other types of digital display devices may also be included. These devices are well known in the art and need not be discussed at length here.
Illustrative Processes
The principles and concepts will now be described with reference to sample processes that may be implemented by a computing device in certain embodiments of the invention.
At block 604, while the target application executes, the profiling process performs blocks 606-612. One will note that before the target application executes, each service that the developer desires to include in process 600 has profiling code associated with the service. Some services may have profiling code already implemented within them. Other services may have profiling code injected within them in a manner known to those skilled in the art. The profiling code may utilize begin and end markers for interpreting specific function call sequences. This enables the profiler to identify the amount of resources spent in a higher-level service for which the function was called.
At block 606, whenever one of the services that have profiling code executes, the profiling code measures the processing time spent and posts that data to the profiling service. The data may be in the form of an event that identifies a start time, an end time, and a category identifier. The start time and end time are associated with the begin and end markers, respectively. The category identifier identifies a service or subsystem associated with the event. If the event is generated for a specific visual element, the event may include an element identifier for the visual element. Occasionally, one service may invoke another service before exiting. For example, a layout service may invoke a rendering service before the layout service is finished. This causes the begin and end times for the rendering service to be completely encompassed within the begin and end times for the layout service. For this scenario, the rendering service event is considered to be a child event of the layout service event. Thus, the time associated with the rendering service is subtracted from the layout service when determining the time for that specific layout service event. The times calculated for specific events, after any child events have been subtracted, are sent in events from the target application to the profiler. They are then aggregated to determine the total time spent in the last N seconds. For example, if the profiler is configured to report time in the last ten seconds and 10 ms were sent in an event one second ago, 8.5 ms were sent in an event 2.5 seconds ago, and 20 ms were sent in an event 9 seconds ago, the profiler sums these individual times to determine a total time spent within an element over the last 10 seconds as 38.5 ms.
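The event handling of block 606 may be sketched in Python as follows. The sketch is illustrative only: the class name Event, its field names, and the function total_in_window are assumptions introduced for the example, and the sample values reproduce the ten-second example given above plus one additional expired sample.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    category: str                      # service or subsystem, e.g. "layout"
    start_ms: float                    # time of the begin marker
    end_ms: float                      # time of the end marker
    element_id: Optional[int] = None   # set when tied to a visual element
    children: List["Event"] = field(default_factory=list)

    def inclusive_ms(self):
        """Elapsed time between the begin and end markers."""
        return self.end_ms - self.start_ms

    def exclusive_ms(self):
        """Elapsed time with the time of any child events subtracted."""
        return self.inclusive_ms() - sum(c.inclusive_ms() for c in self.children)


def total_in_window(samples, now_s, window_s):
    """Sum the durations (ms) of samples posted within the last window_s seconds.

    samples is an iterable of (timestamp_s, duration_ms) pairs.
    """
    return sum(ms for t, ms in samples if now_s - window_s <= t <= now_s)


# A rendering event nested inside a layout event.
layout = Event("layout", 0.0, 12.0, children=[Event("rendering", 3.0, 8.0)])
print(layout.exclusive_ms())  # 7.0

# Events posted 1 s, 2.5 s, 9 s, and 15 s before "now" (t = 100 s).
samples = [(99.0, 10.0), (97.5, 8.5), (91.0, 20.0), (85.0, 5.0)]
print(total_in_window(samples, now_s=100.0, window_s=10.0))  # 38.5
```

The sample posted fifteen seconds ago falls outside the ten-second window and therefore does not contribute to the total.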
At block 608, the raw time information data may be optionally saved. As will be discussed below in
At block 610, the time information that is received is associated with one of the visual elements in the target application. Each visual element in the target application is given a unique integer identifier (ID). As will be described below, because the same ID is passed to both the profiling service and visualization service, the services can associate profiling data with the proper visual element. The time information may be correlated to the visual element based on an element identifier for the visual element.
At block 612, the new time information associated with one of the visual elements is updated, which then may trigger updates in the graphical display of the profiling data. The profiling process continues until the target application exits, until the profiler is de-activated, or until an error occurs which causes the application to stop processing.
At optional block 704, a new root node may be injected into the tree of elements. By adding the new root node, three-dimensional transformations of the target application may be performed. It also allows the three-dimensional camera and view port to be changed. As will be described below, when the visualizer process is performed in the same process as the visualization service, changes in the profiling service cause a new visualization to be rendered on top of the current scene. This occurs because elements are included in a dirty region due to the visualization change. While this embodiment may work, the accuracy of the performance characteristics for the target application is severely altered. As will be explained below, in order to avoid this problem, an overlay of the visual representation may be copied onto the original scene using a copy mechanism. By positioning the copy outside the target scene's bounding rectangle, the target scene is not affected by the visualization. However, the scene is still rendered twice.
At block 706, the visualization process performs blocks 708-714. At block 708, position and bounding box updates to a target application's visual tree are received. As one skilled in the art appreciates, a visual tree contains the visual elements used in a target application's user interface. The visual elements contain persisted drawing information. Therefore, the visual tree may be thought of as a scene graph that contains all the rendering information needed to compose the output to a display device. The visual tree contains the visual elements created by any means, such as by code, markup, template expansion, and the like. The rendering order for the visual elements is based on each visual element's position within the hierarchy of the visual tree. Typically, the order of traversal starts with a root visual, which is the top-most node in the visual tree. The root visual's children are then traversed, left to right. If a visual element has children, its children are traversed before the visual element's siblings. Thus, the content of a visual element's children is typically rendered in front of the visual element's own content.
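The traversal order described above corresponds to a depth-first, pre-order walk of the visual tree. The following Python sketch illustrates that order; the class name Visual, the function render_order, and the element names are hypothetical and serve only as an example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Visual:
    name: str
    children: List["Visual"] = field(default_factory=list)


def render_order(root):
    """Return element names in depth-first, pre-order (left-to-right) order."""
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node.name)
        # Push children in reverse so the left-most child is visited first.
        stack.extend(reversed(node.children))
    return order


tree = Visual("root", [
    Visual("panel", [Visual("button"), Visual("label")]),
    Visual("toolbar"),
])
print(render_order(tree))  # ['root', 'panel', 'button', 'label', 'toolbar']
```

Because an element is visited before its children, the children are drawn later and therefore appear in front of the element's own content.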
At block 710, the positional data (e.g., position and bounding box) may optionally be saved as raw data. By saving at this point, the visualization service and profiling service may be agnostic to whether or not the data is real-time data or played back data. Playback is particularly useful for in-process visualizations such as three-dimensional representations of the performance data, since the high performance impact of the in-process visualization can be delayed to a later time. The visualization data and the performance data may also be recorded with a time-stamp alongside a terminal-service recorder. This would allow the target application's graphical interface to be played back along with the visualization data and performance data, making it possible to know what the target application was doing at the time the profiling data was recorded. During playback, the time stamped positional information and the time-stamped performance data may be consumed by the visualization and profiling services without knowing that the data was previously recorded.
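One way to make the downstream services agnostic to the data source is to record each update with a time-stamp and to replay the records in time-stamp order. The Python sketch below assumes a line-delimited JSON record format; that format, the function names, and the field names are illustrative assumptions and are not part of the described embodiments.

```python
import io
import json


def record_update(stream, timestamp_s, kind, payload):
    """Append one time-stamped update (positional or performance) to a stream."""
    stream.write(json.dumps({"t": timestamp_s, "kind": kind, "payload": payload}) + "\n")


def replay(stream):
    """Yield recorded updates in time-stamp order, as if they arrived live."""
    records = [json.loads(line) for line in stream if line.strip()]
    records.sort(key=lambda r: r["t"])
    for r in records:
        yield r["t"], r["kind"], r["payload"]


# Record two updates (in-memory here; a file would work the same way).
buffer = io.StringIO()
record_update(buffer, 2.0, "time", {"id": 314, "ms": 8.5})
record_update(buffer, 1.0, "position", {"id": 314, "box": [0, 0, 80, 24]})

buffer.seek(0)
for timestamp, kind, payload in replay(buffer):
    print(timestamp, kind, payload)
```

A consumer that iterates over (time-stamp, kind, payload) tuples can be handed either a live feed or the output of replay without modification.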
At block 712, a tree of visual elements is created and updated based on the visual tree information that is received. The tree of visual elements matches the target application's tree of visual elements and contains a bounding box, transformation, element type, and identifier for each visual element in the tree.
At block 714, changes to the tree of elements are propagated. This may occur by sending the changes to the visualizer process's display.
At block 802, new time information is received. In one embodiment, new time information (i.e., performance data) may be received in real-time. In an alternate embodiment, the time information may be time-stamped data that was recorded earlier. The new time information may be piped from the profiling process via an inter-process communication channel.
At block 804, visualization data is received. Again, the visualization data may be received in real-time or may be time-stamped data that was recorded earlier.
At block 806, an inclusive time and an exclusive time are determined based on the new time information. As mentioned above, the inclusive time is based on the processing time for an element including its children and the exclusive time is processing time for the specific element only.
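The relationship between the two times may be sketched as follows: given the exclusive time for each element and the parent-to-children structure of the tree, the inclusive time of an element is its exclusive time plus the inclusive times of its children. The function name, the dictionary layout, and the identifiers below are assumptions made for illustration.

```python
def inclusive_times(root_id, exclusive_ms, children):
    """Return a dict mapping each element ID to its inclusive time in ms.

    exclusive_ms maps element ID -> time spent in that element only.
    children maps element ID -> list of child element IDs.
    """
    result = {}

    def visit(element_id):
        total = exclusive_ms.get(element_id, 0.0)
        for child_id in children.get(element_id, ()):
            total += visit(child_id)
        result[element_id] = total
        return total

    visit(root_id)
    return result


# Hypothetical tree: element 1 has children 2 and 3; element 2 has child 4.
children = {1: [2, 3], 2: [4]}
exclusive = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
print(inclusive_times(1, exclusive, children))
```

The recursion is adequate for a sketch; a very deep visual tree would call for an iterative traversal instead.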
At block 808, a time is determined for each visual element. The time may be a proportional time (i.e., percentage) and/or an absolute time. The time may be further broken down into different services which were employed when performing processing for a corresponding visual element. The proportional time reflects the proportional time the different services performed processing in support of the target application as a whole, as illustrated by the application bar graph in
At block 810, the new time information and/or new visualization data is stored. The information may be stored in a hash table, a database, or the like. Each visual element in the target application is associated with a unique identifier. The unique identifier is then used to correlate the time information and visualization data for each visual element. This allows the profiler to associate profiling data with the proper visual element.
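A hash table keyed by the unique identifier may be sketched in Python with a dictionary. The class names ElementRecord and ProfileStore, their methods, and the sample values are hypothetical and are shown only to illustrate how time information and visualization data for the same element can be correlated.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ElementRecord:
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    time_ms_by_service: Dict[str, float] = field(default_factory=dict)


class ProfileStore:
    """Correlates time information and visualization data by element ID."""

    def __init__(self):
        self._records: Dict[int, ElementRecord] = {}

    def _record(self, element_id):
        # Create the record on first use so either service may report first.
        return self._records.setdefault(element_id, ElementRecord())

    def add_time(self, element_id, service, ms):
        times = self._record(element_id).time_ms_by_service
        times[service] = times.get(service, 0.0) + ms

    def set_bounds(self, element_id, box):
        self._record(element_id).bounding_box = box

    def get(self, element_id):
        return self._records.get(element_id)


store = ProfileStore()
store.add_time(314, "layout", 4.0)       # from the profiling service
store.add_time(314, "layout", 1.5)
store.set_bounds(314, (0, 0, 80, 24))    # from the visualization service
print(store.get(314))
```

Because both services pass the same identifier, the record for an element accumulates its timing data and its positional data regardless of the order in which the updates arrive.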
At block 812, performance data is graphically displayed based on the processing mode configured for the visualizer process. The performance data may illustrate absolute processing costs and/or relative processing costs for each visual element. In addition, both inclusive and exclusive processing times may be graphically displayed. As mentioned above, the visualizer process may be in-process, out-of-process, or playback. When the visualizer process is configured as in-process, another visual node may be inserted near the top of the visual tree. When changes occur in the profiling service, a new visual representation is rendered on top of the current scene for the target application. This implementation, however, requires the target scene to be re-rendered when updates are received by the visualizer process.
An alternative in-process configuration may overlay the visual representation on a copy of the original scene using a copy mechanism. If the copy is positioned outside the target scene's bounding box, the target scene would not be affected by the visual representation overlay. However, the scene would need to be rendered twice.
When the visualizer process is configured as out-of-process, the visualizer process runs in a separate process. The performance data and positional data are then provided to this separate process via inter-process communication. The separate process then positions the semi-transparent visual representation on top of the display generated by the target process.
When the visualizer process is configured for playback, the visualizer process obtains the performance data and positional data from recorded data. By using the time-stamp on the recorded data, the processing for the visualizer process is the same as if the data were received in real-time.
One will note that the length of time that performance data is kept may be configurable. When the time passes, the visualizer may avoid displaying expired data if it exists.
The advantages of the invention are many. For example, by having the performance data shown graphically along with the display being tested, as shown in
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.