Computer runtime environments can host programs while the programs are running. For example, a runtime environment may be a virtual environment running in a host computer system. Runtime environments can include multiple subsystems having different mechanisms for performing a variety of tasks.
Profilers are computing tools that are used to collect performance information such as resource cost data about running computer programs and present the collected information to a user. This is typically done with a profiler component that runs at the same time as a computer program to collect resource cost data for the computer program. The profiler can present such collected information to a developer to provide the developer with information about the performance of the running program.
The tools and techniques described herein can include correlating resource cost data from different subsystems of an environment where a program is running, and attributing the cost data to programming elements (i.e., components of the computer program that can be modified by user input from a program developer, such as scripts, source code, etc.) of that program. The tools and techniques may include constructing a correlation data structure that correlates units of the resource cost data from different subsystems and attributes resource cost data to the programming elements, and/or using such a data structure to analyze a dataset derived from the resource cost data.
As used herein, different runtime environment subsystems (sometimes referred to herein generally as different subsystems) are different types of subsystems of a runtime environment, which is an environment in which a program is run. For example, different subsystems could include a graphics subsystem, a user code execution subsystem, a media decoding subsystem, a networking subsystem, and an overall programming environment (an operating system abstraction layer that manages a lifetime of a running program). There can also be other different subsystems. For example, an operating system itself could be another example of a different subsystem if the scope of a system being profiled were to include the operating system. As another example, resources may be considered subsystems. For example, a GPU (graphics processing unit) may be considered as a subsystem, and a CPU (central processing unit) may be considered as another subsystem.
In one embodiment, the tools and techniques can include running a program in a computer system. The program can include declarative programming elements corresponding to elements of an actual state data structure that is maintained while the program is running. Data can be collected while the program is running. The collected data can include resource cost data from multiple different runtime subsystems in the computer system. A model state data structure can be constructed from the collected data. The model state data structure can represent a data structure that could have produced the resource cost data. A correlation data structure can be generated using the model state data structure. The correlation data structure can correlate the resource cost data from the different runtime subsystems and can attribute units of the resource cost data to the programming elements.
In another embodiment of the tools and techniques, a computer program can run in a computer system. The program can include programming elements corresponding to elements of an actual state data structure that is maintained while the program is running. Data that includes resource cost data can be collected from multiple different runtime subsystems in the computer system while the program is running. An indication to analyze at least a portion of the resource cost data can be received. At least a portion of the cost data can be analyzed with different analyzers using a correlation data structure that correlates the resource cost data from the different runtime subsystems and attributes units of the resource cost data to the programming elements. Additionally, analysis results from the different analyzers for the resource cost data can be composed together.
This Summary is provided to introduce a selection of concepts in a simplified form. The concepts are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Similarly, the invention is not limited to implementations that address the particular techniques, tools, environments, disadvantages, or advantages discussed in the Background, the Detailed Description, or the attached drawings.
Embodiments described herein are directed to techniques and tools for improved analysis of the resource cost data and/or correlation of resource cost data and attributing resource cost data to programming elements. Such improvements may result from the use of various techniques and tools separately or in combination.
Such techniques and tools may include measuring performance metrics and measures to relate resource use cost to individual programming elements across different subsystems. Wall clock time can be used to calculate when individual workloads executed and when the associated costs were incurred. Resource time (e.g., CPU or GPU execution time) or other metrics such as memory usage may be used to quantify and represent the associated resource costs. Resource cost data from different subsystems may be correlated and units of the data may be attributed to individual declarative and/or imperative programming elements based on an execution trace and on the design of a runtime module that is running the program during execution. Information from the different subsystems can be used to correlate resource cost data from different subsystems and attribute the cost data to programming elements. Cost data metrics may or may not correlate to programming elements on a one-to-one basis. For example, for imperative constructs, mapping between collected metrics and programming elements may be a one-to-one mapping. For declarative constructs, mapping between collected metrics and programming elements may include multiple metrics mapping to one declarative programming element.
Accordingly, one or more benefits can be realized from the tools and techniques described herein. For example, information on resource costs may be presented to a user such as a developer in meaningful ways, allowing a developer to be informed as to how costs are distributed, and possibly how programming elements can be modified to improve performance of the program. As an example, a developer may be able to modify a program to perform some functions on a GPU, rather than in a CPU, if that change would decrease the resource costs (e.g., decreasing execution time), and thereby improve overall performance of the program.
The subject matter defined in the appended claims is not necessarily limited to the benefits described herein. A particular implementation of the invention may provide all, some, or none of the benefits described herein. Although operations for the various techniques are described herein in a particular, sequential order for the sake of presentation, it should be understood that this manner of description encompasses rearrangements in the order of operations, unless a particular ordering is required. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, flowcharts may not show the various ways in which particular techniques can be used in conjunction with other techniques.
Techniques described herein may be used with one or more of the systems described herein and/or with one or more other systems. For example, the various procedures described herein may be implemented with hardware or software, or a combination of both. For example, dedicated hardware implementations, such as application specific integrated circuits, programmable logic arrays and other hardware devices, can be constructed to implement at least a portion of one or more of the techniques described herein. Applications that may include the apparatus and systems of various embodiments can broadly include a variety of electronic and computer systems. Techniques may be implemented using two or more specific interconnected hardware modules or devices with related control and data signals that can be communicated between and through the modules, or as portions of an application-specific integrated circuit. Additionally, the techniques described herein may be implemented by software programs executable by a computer system. As an example, implementations can include distributed processing, component/object distributed processing, and parallel processing. Moreover, virtual computer system processing can be constructed to implement one or more of the techniques or functionality, as described herein.
The computing environment (100) is not intended to suggest any limitation as to scope of use or functionality of the invention, as the present invention may be implemented in diverse general-purpose or special-purpose computing environments.
With reference to
Although the various blocks of
A computing environment (100) may have additional features. In
The storage (140) may be removable or non-removable, and may include computer-readable storage media such as magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, or any other medium which can be used to store information and which can be accessed within the computing environment (100). The storage (140) stores instructions for the software (180).
The input device(s) (150) may be a touch input device such as a keyboard, mouse, pen, or trackball; a voice input device; a scanning device; a network adapter; a CD/DVD reader; or another device that provides input to the computing environment (100). The output device(s) (160) may be a display, printer, speaker, CD/DVD-writer, network adapter, or another device that provides output from the computing environment (100).
The communication connection(s) (170) enable communication over a communication medium to another computing entity. Thus, the computing environment (100) may operate in a networked environment using logical connections to one or more remote computing devices, such as a personal computer, a server, a router, a network PC, a peer device or another common network node. The communication medium conveys information such as data or computer-executable instructions or requests in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other carrier.
The tools and techniques can be described in the general context of computer-readable media, which may be storage media or communication media. Computer-readable storage media are any available storage media that can be accessed within a computing environment, but the term computer-readable storage media does not refer to propagated signals per se. By way of example, and not limitation, with the computing environment (100), computer-readable storage media include memory (120), storage (140), and combinations of the above.
The tools and techniques can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing environment. In a distributed computing environment, program modules may be located in both local and remote computer storage media.
For the sake of presentation, the detailed description uses terms like “determine,” “choose,” “adjust,” and “operate” to describe computer operations in a computing environment. These and other similar terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human being, unless performance of an act by a human being (such as a “user”) is explicitly noted. The actual computer operations corresponding to these terms vary depending on the implementation.
One or more actual state data structures (224) can be defined by the program (220) and can be maintained by the runtime environment (210) when the program (220) is running. For example, a state data structure (224) may be a tree structure, such as a tree that defines a user interface in a declarative manner. The state data structure(s) (224) are illustrated within the program (220) in
As the program (220) is run in the runtime environment (210), multiple different subsystems (230) in the runtime environment (210) can be involved in supporting and running the program (220). For example, the subsystems (230) may include a graphics subsystem (232), a user code execution subsystem (234), a media decoding subsystem (236), a networking subsystem (238), an overall programming environment subsystem (240), and an operating system (242). Alternatively, the operating system (242) may not be considered to be one of the subsystems (230) to be profiled in some situations, such as where the analysis is to be focused on subsystems other than the operating system (242).
A profiler probe (250) running in the runtime environment (210) can collect data including cost data (260) as the program (220) runs. This cost data (260) can be integrated cost data from the different subsystems (230). The profiler probe (250) can communicate the cost data (260) to the profiler tool (202), which may run outside the runtime environment (210), as illustrated. Alternatively, the profiler tool (202) may run inside the runtime environment (210), and may also act as the profiler probe (250). As an example, the cost data (260) may be collected in one or more trace log files.
The profiler may use the collected data including the cost data (260) to construct one or more model state data structure(s) (270) that can be used to map items from the cost data (260) to the actual state data structure(s) (224), which can include or be mapped to the programming elements (222) of the program (220). A model state data structure (270) can match a corresponding actual state data structure (224). For example, if an actual state data structure (224) is a tree data structure, the matching model state data structure (270) may also be a tree data structure. Other data structures could be used as well. For example, a list of nodes created during execution time can be maintained, and a separate structure can track the parent of a given node at a given time. Accordingly, this structure can represent a list of nodes with each node having information on the parent and the time for which the parent node was active. When a tree structure is to be created for a given time, a scan can be performed through each node in the list to identify the parent of the current node at that time. The current node can be added to the tree structure as a child of the identified parent node.
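As an illustration only, the list-of-nodes alternative described above can be sketched as follows. The names NodeRecord, parent_at, and build_tree, and the layout of the parent history as (parent, start time, end time) tuples, are assumptions made for this sketch and are not taken from the described embodiments.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class NodeRecord:
    """A node created during execution, plus its parent history (hypothetical layout)."""
    node_id: str
    # Each entry is (parent_id, start_time, end_time); end_time None means still active.
    parents: List[Tuple[Optional[str], float, Optional[float]]] = field(default_factory=list)


def parent_at(record: NodeRecord, t: float) -> Tuple[bool, Optional[str]]:
    """Return (found, parent_id) for the parent interval that was active at time t."""
    for parent_id, start, end in record.parents:
        if start <= t and (end is None or t < end):
            return True, parent_id
    return False, None


def build_tree(records: List[NodeRecord], t: float) -> Dict[Optional[str], List[str]]:
    """Scan every node in the list and attach it as a child of its parent at time t.

    The result maps parent_id -> list of child ids; the key None holds root nodes.
    """
    children: Dict[Optional[str], List[str]] = {}
    for record in records:
        found, parent_id = parent_at(record, t)
        if not found:
            continue  # the node was not part of the tree at time t
        children.setdefault(parent_id, []).append(record.node_id)
    return children


records = [
    NodeRecord("page", [(None, 0.0, None)]),
    NodeRecord("panel", [("page", 1.0, None)]),
    # The text box is re-parented from the page to the panel at time 5.0.
    NodeRecord("textbox", [("page", 2.0, 5.0), ("panel", 5.0, None)]),
]
tree_at_3 = build_tree(records, 3.0)
tree_at_6 = build_tree(records, 6.0)
```

In this sketch the same node list yields different trees for different times, which is the property that allows a model tree to be reconstructed for whichever wall clock time is being analyzed.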
As an example of constructing a model state data structure (270), the cost data (260) may be included in an execution trace log file generated while running the program (220). The cost data (260) and the other collected data in that trace file can come from multiple different subsystems (230). The profiler probe (250) may insert markers in the code of the program (220), so that such markers trigger events that are recorded in the trace log file. The data for each such event may include a system clock time (wall clock time) for the event. For example, a log entry for an execution time on a processor may include start and/or stop system wall clock times. For resource usage that is measured in units of time, a log entry may also include a resource time for the particular trace. For example, the entry may indicate an amount of execution time on a GPU or CPU. Log entries may also include other information, such as the amount of memory allocated for a particular element corresponding to an element of a state data structure (224), a frame rate at a particular wall clock time, etc. Because the log file can include such resource usage entries from all the different subsystems (230) that are to be included in the profile, those usages can be combined to reveal the overall resource usage by imperative components being profiled in the runtime environment (210).
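By way of a hedged example, a trace log entry of the kind described above might be modeled as shown below. The TraceEntry type, its field names, and the use of integer microseconds are assumptions for this sketch; the description does not specify a log format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TraceEntry:
    """One hypothetical trace log entry; all times are integer microseconds."""
    subsystem: str         # e.g. "user_code" or "graphics"
    resource: str          # e.g. "CPU" or "GPU"
    wall_start_us: int     # system (wall clock) time when the work started
    wall_stop_us: int      # system (wall clock) time when the work stopped
    resource_time_us: int  # execution time actually spent on the resource


def total_resource_time(entries: List[TraceEntry]) -> Dict[str, int]:
    """Combine entries from all subsystems into one total per resource."""
    totals: Dict[str, int] = defaultdict(int)
    for entry in entries:
        totals[entry.resource] += entry.resource_time_us
    return dict(totals)


def wall_span(entries: List[TraceEntry]) -> Tuple[int, int]:
    """Return the earliest start and latest stop wall clock times in the trace."""
    return (min(e.wall_start_us for e in entries),
            max(e.wall_stop_us for e in entries))


entries = [
    TraceEntry("user_code", "CPU", 0, 10_000, 8_000),
    TraceEntry("graphics", "CPU", 10_000, 14_000, 4_000),
    TraceEntry("graphics", "GPU", 12_000, 20_000, 6_000),
]
totals = total_resource_time(entries)
span = wall_span(entries)
```

Keeping wall clock time separate from resource time in each entry lets the same log answer both when a cost was incurred and how large that cost was.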
The cost data (260) may also include some unwanted data, such as data from resource usage by the profiler probe (250) as the probe (250) collects the cost data (260). Such unwanted cost data may be identified and subtracted out by the profiler probe (250) and/or the profiler tool (202). This may be done by actually measuring costs contributed by the profiler probe (250). However, measuring the costs of the profiler probe (250) may introduce even more unwanted costs. Accordingly, the unwanted costs may be quantified in some other manner, such as by estimating costs contributed by the profiler probe (250). For example, estimates can be based on how many cost samples were taken by the profiler probe (250) and/or based on some other information. Rather than actually subtracting out the costs, information about unwanted costs may be presented along with representations of the cost data (260) itself. For example, information on tolerances for the cost data (260) may be presented.
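One possible way to estimate probe overhead from the number of samples taken is sketched below. The per-sample cost, the relative uncertainty, and the function names are hypothetical values and names chosen for illustration; an actual profiler could quantify unwanted costs differently.

```python
from typing import Tuple


def estimate_probe_overhead(sample_count: int, cost_per_sample_us: int) -> int:
    """Estimate the probe's own cost as samples taken times an assumed per-sample cost."""
    return sample_count * cost_per_sample_us


def adjust_cost(measured_us: int, sample_count: int, cost_per_sample_us: int,
                relative_uncertainty: float = 0.5) -> Tuple[int, float]:
    """Subtract the estimated overhead and return (adjusted cost, tolerance).

    The tolerance reflects that the overhead is only an estimate; the 0.5
    default is an assumed value, not a measured one.
    """
    overhead = estimate_probe_overhead(sample_count, cost_per_sample_us)
    adjusted = max(0, measured_us - overhead)  # never report a negative cost
    tolerance = overhead * relative_uncertainty
    return adjusted, tolerance


adjusted, tolerance = adjust_cost(measured_us=10_000, sample_count=200,
                                  cost_per_sample_us=5)
```

Returning a tolerance alongside the adjusted value corresponds to the option of presenting information about unwanted costs together with the cost data rather than only subtracting them out.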
As noted above, the cost data (260) can reveal the costs of the imperative elements from the runtime environment (210). However, at least some of those imperative elements may not be actual programming elements (222) of the program (220). For example, where the programming elements (222) are declarative programming elements, the subsystems (230) of the runtime environment (210) may execute those declarative programming elements by invoking imperative elements of the runtime environment (210) that specify how tasks identified by the declarative programming are to be executed. Accordingly, the profiler tool (202) may include tools that correspond to designs in the runtime environment (210), and those tools can identify conditions that would have been caused by the particular runtime environment (210) executing particular types of declarative programming elements from the program (220). For example, declarative languages are often executed by corresponding engines. Using knowledge of how an engine works and a profile of the time taken by particular components in the runtime environment (210) when running the program (220), a mapping can be generated of what declarative elements would have caused the time and component profile. Accordingly, tools may be used to construct the model state data structure (270), which can represent a state data structure that could have produced the collected data including the cost data (260).
The model state data structure (270) may include elements that correspond to different types of declarative programming elements, and may also include some elements that correspond to types of imperative programming elements. Accordingly, the model state data structure(s) (270) can match the actual state data structure(s) (224) that were present when the program (220) was running in the runtime environment (210). Thus, elements of the model state data structures (270), which are already matched to items in the cost data (260), can be matched to structures in the program (220) that include the programming elements (222).
Accordingly, the profiler tool (202) can use the model state data structure(s) (270) to match items in the cost data (260) to the programming elements (222) in the program (220), even when those programming elements (222) are declarative elements. That matching can be used to construct the correlation data structure (280), which can correlate the cost data (260) for different subsystems (230) and can also attribute items in the cost data (260) to the matching programming elements (222). The cost data (260) for different subsystems may be correlated using timing information from the trace log file, which can also be included in the correlation data structure (280). For example, items of the cost data (260) identifying memory usage by different subsystems (230) at a particular wall clock time may be combined and summed to reveal an overall memory usage for that time. The correlation data structure (280) may include one or more tables, such as a table for each type of cost being profiled (e.g., one table for memory usage, one table for GPU usage, one table for CPU usage, etc.). For example, referring to
As noted above, the correlation data structure (280) can correlate cost data from different subsystems (230). For example, consider a declarative programming element (222) that indicates a text box. The user code execution subsystem (234), the graphics subsystem (232), and the GPU may each have memory allocated for the text box. Accordingly, to calculate how much memory the text box programming element is using, information from the different subsystems can be combined. The memory usage by different subsystems can be tracked by tracking what objects are allocated by what other objects. For example, object A may be allocated by the user code, object A may allocate object B, object B may allocate object C, etc. And these objects may be allocated in different subsystems (230). The tracking of this allocation can be maintained down to the processor level (e.g., including memory allocated in the GPU). These allocations from different subsystems can then be correlated to the object that was allocated by the programming element (222) in the memory table. Also, the memory table may include an indication of a subsystem (e.g., user code execution subsystem (234), graphics subsystem (232), GPU subsystem, etc.) from which an item of memory allocation was derived.
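The allocation tracking described above can be sketched as follows, assuming a simple record of which object allocated which other object. The tuple layout, the object and element names, and the functions owning_element, memory_table, and bytes_per_element are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical allocation records: (object_id, allocated_by, subsystem, bytes).
allocations: List[Tuple[str, str, str, int]] = [
    ("A", "textbox", "user_code", 256),  # allocated directly by the text box element
    ("B", "A", "graphics", 1024),        # object A allocates object B
    ("C", "B", "gpu", 4096),             # object B allocates object C in GPU memory
    ("D", "image", "user_code", 128),
]
programming_elements: Set[str] = {"textbox", "image"}


def owning_element(obj: str, allocated_by: Dict[str, str],
                   elements: Set[str]) -> Optional[str]:
    """Walk the allocation chain upward until a programming element is reached."""
    current = obj
    seen: Set[str] = set()
    while current not in elements:
        if current in seen or current not in allocated_by:
            return None  # unknown allocator or a cycle; leave unattributed
        seen.add(current)
        current = allocated_by[current]
    return current


def memory_table(allocs: List[Tuple[str, str, str, int]],
                 elements: Set[str]) -> List[dict]:
    """Build memory table rows, keeping the subsystem each allocation came from."""
    allocated_by = {obj: parent for obj, parent, _, _ in allocs}
    return [
        {"element": owning_element(obj, allocated_by, elements),
         "object": obj, "subsystem": subsystem, "bytes": size}
        for obj, _, subsystem, size in allocs
    ]


def bytes_per_element(rows: List[dict]) -> Dict[Optional[str], int]:
    """Sum the memory attributed to each programming element across subsystems."""
    totals: Dict[Optional[str], int] = {}
    for row in rows:
        totals[row["element"]] = totals.get(row["element"], 0) + row["bytes"]
    return totals


rows = memory_table(allocations, programming_elements)
usage = bytes_per_element(rows)
```

Here the text box is charged for objects A, B, and C even though they were allocated in three different subsystems, while each row still records the subsystem from which the allocation was derived.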
The correlation data structure (280) can include multiple different sub-structures, such as multiple different tables that can be constructed from trace file information using knowledge of the design of the runtime environment (210). For example, one table may include a graph that maps between different objects from the runtime environment (210). For example, that graph can identify relationships such as parent-child relationships between the objects. That graph may be used in constructing the model state data structure(s) (270). Other tables correlate items of resource usage by time and may attribute each usage item to a programming element that was responsible for the usage. For example, tables in the correlation data structure (280) may include a table that maps memory usage to objects, a table that maps GPU usage to objects, a table that maps CPU usage to objects, etc.
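A minimal sketch of such per-cost-type tables is shown below. The item layout, the table keys, and the functions build_tables and total_at are assumptions for this sketch; the description leaves the concrete table format open.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical usage items: (cost_type, wall_time_ms, subsystem, element, amount).
# Amounts are bytes for "memory" and milliseconds for "cpu" and "gpu".
trace_items: List[Tuple[str, int, str, str, int]] = [
    ("memory", 100, "user_code", "textbox", 256),
    ("memory", 100, "graphics", "textbox", 1024),
    ("memory", 100, "gpu", "textbox", 4096),
    ("memory", 200, "graphics", "image", 2048),
    ("cpu", 100, "user_code", "textbox", 8),
    ("gpu", 100, "graphics", "textbox", 6),
]


def build_tables(items: List[Tuple[str, int, str, str, int]]) -> Dict[str, List[dict]]:
    """Group usage items into one table per type of cost being profiled."""
    tables: Dict[str, List[dict]] = defaultdict(list)
    for cost_type, wall_time_ms, subsystem, element, amount in items:
        tables[cost_type].append({
            "wall_time_ms": wall_time_ms,
            "subsystem": subsystem,
            "element": element,  # programming element the usage is attributed to
            "amount": amount,
        })
    return dict(tables)


def total_at(table: List[dict], wall_time_ms: int) -> int:
    """Sum usage from all subsystems recorded at one wall clock time."""
    return sum(row["amount"] for row in table if row["wall_time_ms"] == wall_time_ms)


tables = build_tables(trace_items)
memory_at_100 = total_at(tables["memory"], 100)
```

Because every row carries a wall clock time, usage reported by different subsystems can be lined up and summed for the same moment.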
The correlation data structure (280) can be analyzed to provide information related to the costs attributed to the programming elements (222). For example, the profiler tool (202) may apply a series of filters and/or analyzers to the data in the correlation data structure (280). This filtering and/or analysis may be done in response to user input.
The analyzers and filters may be applied in various ways. One example of an analyzer flow (400) is illustrated in
The first dataset context (410) can be provided to a first analyzer (420), which may be a component of the profiler tool (202) of
Referring to
The analyzers can perform various different types of analyses on the cost data. For example, one analyzer may receive data within a specified time period and determine what elements are being modified during that time. Another analyzer may receive elements that have been modified, and that analyzer may yield the storyboards that were causing the elements to be modified. Another analyzer may sort data to see which has the greatest resource cost (most execution time, most memory, etc.). A declarative model may be used by the profiler tool (202) to indicate what analyzers are applicable in specified situations. For example, available analyzers may be limited to different aspects depending on what type of resource cost or usage is being analyzed and what associated sub-structure of the correlation data structure (280) will be used (frame rate table, memory usage table, CPU usage table, GPU usage table, etc.).
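As one hedged illustration, analyzers of the kinds described above can be modeled as functions that each take a dataset and return a derived dataset, so that they can be chained. The row layout, the analyzer names, and the APPLICABLE mapping (a stand-in for the declarative model of applicable analyzers) are assumptions for this sketch.

```python
from typing import Callable, Dict, List

# Hypothetical rows from a CPU usage table of the correlation data structure.
cpu_rows: List[dict] = [
    {"element": "textbox", "wall_time_ms": 100, "cpu_ms": 8},
    {"element": "image", "wall_time_ms": 150, "cpu_ms": 20},
    {"element": "button", "wall_time_ms": 400, "cpu_ms": 3},
    {"element": "textbox", "wall_time_ms": 180, "cpu_ms": 5},
]


def time_window(start_ms: int, end_ms: int) -> Callable:
    """Analyzer that keeps only rows whose wall clock time is in [start_ms, end_ms)."""
    def analyze(rows):
        return [r for r in rows if start_ms <= r["wall_time_ms"] < end_ms]
    return analyze


def cost_by_element(metric: str) -> Callable:
    """Analyzer that totals a metric per element and sorts by greatest cost first."""
    def analyze(rows):
        totals: Dict[str, int] = {}
        for r in rows:
            totals[r["element"]] = totals.get(r["element"], 0) + r[metric]
        return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return analyze


def compose(*analyzers: Callable) -> Callable:
    """Chain analyzers so that each one receives the previous analyzer's output."""
    def run(data):
        for analyzer in analyzers:
            data = analyzer(data)
        return data
    return run


# Assumed declarative description of which analyzers apply to which table.
APPLICABLE = {
    "cpu_usage": ["time_window", "cost_by_element"],
    "frame_rate": ["time_window"],
}

pipeline = compose(time_window(100, 200), cost_by_element("cpu_ms"))
ranking = pipeline(cpu_rows)
```

Treating each analyzer as a small function over a dataset keeps the analyzers independent of one another, which is what makes their results straightforward to compose.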
Several resource cost data correlation techniques will now be discussed. Each of these techniques can be performed in a computing environment. For example, each technique may be performed in a computer system that includes at least one processor and at least one memory including instructions stored thereon that when executed by the at least one processor cause the at least one processor to perform the technique (one or more memories store instructions (e.g., object code), and when the processor(s) execute(s) those instructions, the processor(s) perform(s) the technique). Similarly, one or more computer-readable storage media may have computer-executable instructions embodied thereon that, when executed by at least one processor, cause the at least one processor to perform the technique.
Referring to
The technique of
Also, the analysis results can include one or more suggestions for modifying one or more of the programming elements. Such suggestions may be derived from patterns or conditions that can be identified by one or more of the analyzers. For example, the suggestions could be for different settings, different approaches to the code in the programming elements, etc. As one example, in some situations, performance may be improved if a user interface element is cached as a bitmap, rather than reconstructing the element each time it is to be rendered. In other situations, such bitmap caching may hinder performance, such as where the user interface element is not used frequently or where the user interface element is frequently changed. Accordingly, an analyzer may be configured to determine whether it would be useful to cache the bitmap, determine whether the bitmap is already being cached, and suggest a corresponding modification to the code for the user interface element if it appears that such a modification would improve performance.
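The bitmap caching example above could be approximated by an analyzer along the following lines. The ElementStats fields and the threshold values are hypothetical; real thresholds would depend on the runtime environment and on measured costs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementStats:
    """Hypothetical per-element statistics derived from the cost data."""
    name: str
    render_count: int       # how many times the element was rendered
    change_count: int       # how many of those renders followed a change
    cached_as_bitmap: bool  # whether bitmap caching is already enabled


def suggest_bitmap_cache(stats: ElementStats, min_renders: int = 30,
                         max_change_ratio: float = 0.1) -> Optional[str]:
    """Return a suggestion string, or None when no change appears useful.

    Both thresholds are assumed values chosen only for illustration.
    """
    if stats.render_count == 0:
        return None
    change_ratio = stats.change_count / stats.render_count
    worth_caching = (stats.render_count >= min_renders
                     and change_ratio <= max_change_ratio)
    if worth_caching and not stats.cached_as_bitmap:
        return f"Consider caching '{stats.name}' as a bitmap."
    if not worth_caching and stats.cached_as_bitmap:
        return f"Consider removing bitmap caching from '{stats.name}'."
    return None


logo_hint = suggest_bitmap_cache(ElementStats("logo", 120, 2, False))
ticker_hint = suggest_bitmap_cache(ElementStats("ticker", 120, 90, True))
```

The sketch checks both whether caching would help and whether it is already in place, so that a suggestion is produced only when the two disagree.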
The correlation data structure may include one or more tables. The actual state data structure and the model state data structure may be tree structures.
Running (510) the program can include running the program in a runtime environment having a runtime module that processes the declarative programming elements according to one or more imperative techniques. Constructing (530) the model state data structure from the collected data can include invoking one or more reconstruction techniques corresponding to at least one of the one or more imperative techniques.
Referring to
The technique of
Referring to
Referring still to
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
This application claims priority to U.S. Provisional Patent Application No. 61/474,460, entitled RESOURCE COST CORRELATION ACROSS DIFFERENT SUBSYSTEMS, filed Apr. 12, 2011, which is incorporated herein by reference.
Number | Date | Country
---|---|---
61474460 | Apr 2011 | US