Profiling enables examination of program behavior to focus performance tuning. A profiler is an automated tool that produces a profile of a program from information collected during program execution. A profile captures behavioral characteristics of a program. One or more portions of a program can be identified from a profile as candidates for optimization. For example, a profile might indicate that an excessive amount of time is spent executing a particular function. In other words, a profile aids understanding of program behavior to allow concentration of optimization efforts. Profilers are often classified based on their methods of gathering data, among other things.
There are two distinct approaches to gathering profile data, namely instrumentation and sampling. In the instrumentation approach, code is added to a program to collect information during execution. Here, the added code is an instrument that measures program behavior as the program executes. For example, the frequency and duration of function calls can be measured. In the sampling approach, an executing program is halted periodically using operating system functionality and sampled to determine the current state of execution. Accordingly, it can be observed that, for example, twenty percent of the time the program is executing a specific code point. The sampling approach thus provides a statistical approximation rather than exact data.
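The statistical tally underlying the sampling approach can be sketched as follows. For simplicity, this hypothetical sketch takes the samples (the code point observed at each periodic halt) as a pre-collected list rather than capturing them from a live process:

```python
from collections import Counter

def summarize_samples(samples):
    """Convert raw samples (the code point observed at each halt)
    into a percentage breakdown -- the statistical approximation
    a sampling profiler reports."""
    counts = Counter(samples)
    total = len(samples)
    return {point: 100.0 * n / total for point, n in counts.items()}

# Two of ten samples landed in render_frame, i.e. twenty percent of the
# time the program was executing that code point.
samples = ["render_frame"] * 2 + ["parse_css"] * 5 + ["idle"] * 3
breakdown = summarize_samples(samples)
```

The function and sample names are illustrative only; a real sampling profiler would obtain stack snapshots from the operating system at each halt.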
Profile data is often presented as a call tree (a.k.a., call graph) that breaks down program execution. For example, a call tree can show function execution paths that were traversed in a program. The root node of the call tree can point to the entry point into the program and each other node in the tree can identify a called function as well as performance data such as execution time of the called function. The edges between nodes can represent function calls, and cycles can be indicative of recursive calls. The call tree can be analyzed by a developer to identify program hotspots, such as functions that occupy a large portion of execution time, among other things.
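The call tree described above can be represented with a simple node structure. The following is a minimal sketch (names are illustrative): each node identifies a called function and its performance data, and inclusive time aggregates over callees to expose hotspots:

```python
class CallTreeNode:
    """A node in a call tree: the called function plus performance data.
    Edges to children represent function calls made by this function."""
    def __init__(self, name, exclusive_time=0.0):
        self.name = name
        self.exclusive_time = exclusive_time  # time spent in this function alone
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child

    def inclusive_time(self):
        """Total time including callees -- the figure used to spot hotspots."""
        return self.exclusive_time + sum(c.inclusive_time() for c in self.children)

# The root points at the program entry point; main calls parse and render,
# and render in turn calls layout.
root = CallTreeNode("main", 1.0)
root.add_child(CallTreeNode("parse", 2.0))
render = root.add_child(CallTreeNode("render", 5.0))
render.add_child(CallTreeNode("layout", 4.0))
```

A developer analyzing such a tree would compare each node's inclusive time against the root's to identify functions occupying a large portion of execution time.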
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed subject matter. This summary is not an extensive overview. It is not intended to identify key/critical elements or to delineate the scope of the claimed subject matter. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
Briefly described, the subject disclosure pertains to visualization of profile data. A set of one or more visualizations can be rendered to facilitate program performance analysis utilizing either real time or historical profile data. In accordance with one aspect, a plurality of correlated visualizations can be presented that provide different types of views of profile data. In accordance with another aspect, a set of visualizations can operate with respect to logically grouped profile data to enable meaningful analysis of program execution. Here, profile data can be ascribed to groups that convey information about high-level semantic function of a program or sub-systems, among other things, based on an organizational scheme, for example. Mechanisms are also provided to enable recording and playing back profile data as well as controlling the granularity or scope thereof. Further, visualizations can provide feedback based on designated performance goals.
To the accomplishment of the foregoing and related ends, certain illustrative aspects of the claimed subject matter are described herein in connection with the following description and the annexed drawings. These aspects are indicative of various ways in which the subject matter may be practiced, all of which are intended to be within the scope of the claimed subject matter. Other advantages and novel features may become apparent from the following detailed description when considered in conjunction with the drawings.
Visualization of profile data conventionally suffers from several problems. First, data is expressed in a manner that is too granular and abstract. That is, data is rendered at a time scale that cannot be associated with user action and/or meaningful program execution semantics (e.g., refresh window). Navigating a complete profile session rendered as a timeline is therefore difficult or unproductive toward identifying performance issues. Second, profilers traditionally operate against historical information. By way of example, a program can be started, collection is enabled, the program is exercised, collection is halted, and the captured data is subsequently analyzed. While there are some performance tools that can provide real time application monitoring, these tools are limited as well since they cannot be started and stopped in flexible ways, various counters are not correlated with one another, and there is limited sophistication in the analysis and visualization provided.
Details below are generally directed toward visualization of profile data in various ways to facilitate program performance analysis. In accordance with one aspect, a plurality of correlated visualizations can be presented that provide different types of views of profile data. In accordance with one particular embodiment, at least a portion of the profile data can be logically grouped to enable meaningful analysis of program execution. Here, profile data can be ascribed to groups that convey information about high-level semantic function of a program or sub-systems, among other things, based on an organizational scheme, for example. Visualizations can also reflect the state of profile data with respect to designated performance goals, and mechanisms are provided to enable recording and playback of profile data as well as controlling the scope of profile data. In one instance, the set of visualizations can be presented simultaneously with program execution. Alternatively, the set of visualizations can operate over historical data.
Various aspects of the subject disclosure are now described in more detail with reference to the annexed drawings, wherein like numerals refer to like or corresponding elements throughout. It should be understood, however, that the drawings and detailed description relating thereto are not intended to limit the claimed subject matter to the particular form disclosed. Rather, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the claimed subject matter.
Referring initially to
The visualization system 100 includes visualization engine 110, data store 120, collection component 130, retrieval component 140, and condition component 150. The visualization engine 110 (also a component as defined herein) is configured to render a set of one or more visualizations. Herein, rendering refers to converting coded content to a format for display, or generating content in a display format, which can subsequently be presented on a physical display (e.g., LCD, touch screen . . . ) connected to a computer or other processor-based device. The visualization engine supports various types of visualizations, or visual content, including but not limited to numerous graphs (e.g., line, bar, pie . . . ), images, and textual content, among other things. Upon receipt or acquisition of input, the visualization engine 110 can render one or more visualizations populated with profile data. In accordance with one embodiment, the visualization engine 110 is configured to operate in real time with respect to a stream of profile data associated with an executing program. Alternatively, the visualization engine 110 can operate with respect to historical profile data (e.g., persisted log file). In any event, note that the visualizations need not be static but rather can be animated in response to acquired profile data. In other words, visualizations can be updated as profile data is acquired.
The data store 120 is a computer readable/accessible medium that is configured to store arbitrary and potentially copious amounts of data. As examples, the data store 120 can be embodied as a log file, a database, and/or an in-memory representation of data. In any event, the data store 120 can save substantially any data to facilitate profile visualization. Furthermore, while illustrated within the visualization system 100, the data store 120 can instead reside outside the visualization system 100, or multiple data stores can be provisioned both inside and outside the visualization system 100.
The collection component 130 is configured to accumulate and record, or save, data, for instance to the data store 120. For example, the collection component 130 can save a stream of profile data and optionally user input. In accordance with one particular embodiment, the collection component 130 can acquire and save one or more screenshot images of an executing program being exercised or other visualizations. User interactions with respect to the one or more screenshot images such as mouse movements, clicks, and text entry, can be saved as well.
The retrieval component 140 is configured to retrieve data from the data store 120 and provide the data to the visualization engine 110. In accordance with one embodiment, the retrieval component 140 can simply push data to the visualization engine. Alternatively, the retrieval component 140 can provide data in response to a parameterized request from the visualization engine 110. Further, the retrieval component 140 is configured to perform various processing (e.g., query processing) of the data acquired from the data store prior to returning processed data to the visualization engine 110 for rendering. For example, a particular subset of profile data can be returned that satisfies a request. The retrieval component 140 can also be configured to save data, for example to the data store 120. For instance, the retrieval component 140 can cache processed results to the data store 120 to provide an efficient response to requests that are parameterized in the same way. Accordingly, various caching techniques known in the art can be employed.
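The caching of identically parameterized requests described above can be sketched as follows. This is a minimal illustration under assumed data shapes (records as timestamp/group/value tuples); names such as `RetrievalComponent` merely echo the description and are not a prescribed implementation:

```python
class RetrievalComponent:
    """Answers parameterized requests for profile data, caching processed
    results so that requests parameterized in the same way are served
    efficiently without reprocessing."""
    def __init__(self, store):
        self.store = store  # list of (timestamp, group, value) records
        self.cache = {}

    def query(self, group, start, end):
        key = (group, start, end)  # requests with identical parameters share a key
        if key not in self.cache:
            self.cache[key] = [r for r in self.store
                               if r[1] == group and start <= r[0] < end]
        return self.cache[key]

store = [(0.5, "render", 10), (1.5, "render", 12), (1.6, "layout", 3)]
rc = RetrievalComponent(store)
result = rc.query("render", 0.0, 2.0)  # filtered once, then served from cache
```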
In one particular embodiment, the retrieval component 140 can enable playback of historical profile data. More particularly, the retrieval component 140 can be configured to enable replay to be started, stopped, paused, moved back to a previous point in time, moved forward to a later point in time, slowed down, or sped up. For instance, a user can instruct the system to move forward to a particular point in time of interest and then pause the playback to further investigate the data. Furthermore, if execution is delineated into segments, movement can be made from a first segment to a second segment.
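The playback controls described above can be sketched as a small state holder. This is an illustrative sketch only (the class and attribute names are assumptions, not part of the disclosure):

```python
class PlaybackController:
    """Minimal playback state for historical profile data: start, stop,
    pause, seek, speed control, and movement between segments."""
    def __init__(self, duration, segments=()):
        self.duration = duration
        self.segments = sorted(segments)  # start times of delineated segments
        self.position = 0.0
        self.speed = 1.0
        self.playing = False

    def play(self):
        self.playing = True

    def pause(self):
        self.playing = False

    def seek(self, t):
        """Move back or forward to a point in time, clamped to the session."""
        self.position = max(0.0, min(t, self.duration))

    def set_speed(self, s):
        """Slow down (s < 1) or speed up (s > 1) replay."""
        self.speed = s

    def next_segment(self):
        """Move from the current segment to the next one, if any."""
        for start in self.segments:
            if start > self.position:
                self.position = start
                return start
        return None

pc = PlaybackController(duration=30.0, segments=[0.0, 10.0, 20.0])
pc.seek(12.5)   # move forward to a point of interest
pc.pause()      # then pause to investigate the data
```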
The condition component 150 is configured to enable specification of significant runtime conditions, or, in other words, performance goals. A user can specify hard and/or soft conditions such as those relating to throughput (e.g., desired frame rate) of a program utilizing functionality provided by the condition component 150. Such conditions can be saved to the data store 120 directly or by way of collection component 130, and subsequently acquired and employed by the visualization engine 110. The visualization engine 110 can present these conditions visually as a form of user feedback. As examples, event markers can be applied to timelines, a particular color can be applied to a graphic, optionally flashing, or other dynamic indicators of condition failure or satisfaction can be presented. In accordance with one embodiment, the visualization engine 110 can present readable text describing met or unmet conditions in a message window.
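A condition such as the frame-rate goal mentioned above can be evaluated against incoming profile data and rendered as readable text for a message window. The following is a hedged sketch; the `Condition` class and the sample dictionary shape are hypothetical:

```python
class Condition:
    """A designated performance goal, e.g. a desired frame rate,
    evaluated against incoming profile data samples."""
    def __init__(self, name, predicate, message):
        self.name = name
        self.predicate = predicate  # callable returning True when the goal is met
        self.message = message

    def evaluate(self, sample):
        """Return satisfaction plus a readable status line suitable for
        display in a message window."""
        ok = self.predicate(sample)
        status = "met" if ok else "UNMET"
        return ok, f"[{status}] {self.name}: {self.message}"

# Hypothetical throughput goal: frame rate at or above 30 fps.
frame_rate_goal = Condition(
    "frame-rate", lambda s: s["fps"] >= 30,
    "desired frame rate of 30 fps")
ok, text = frame_rate_goal.evaluate({"fps": 24})
```

A visualization engine consuming such results could apply event markers, colors, or flashing indicators for unmet conditions in addition to the textual form shown here.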
Turning to
The data graphic component 220 is configured to generate visualizations of profile data, for example in an integrated environment. For example, the visualization can include various types of graphs such as pie, line, or bar graphs. However, any visual mechanism that presents data in a useful manner can be employed. By way of example, and not limitation, one or more screenshot images of an executing program can be provided by the data graphic component 220.
The coordination component 230 is configured to coordinate, or correlate, data across visualizations of profile data. In accordance with one aspect, numerous visualizations can be rendered simultaneously to facilitate analysis of program execution. The coordination component 230 enables the visualizations to be coordinated in some manner, for example based on time (e.g., timeline). In this manner, all visualizations will render the profile data in concert. By way of example, if a user initiates zooming in on a particular segment of data in one visualization (e.g., line graph), other active visualizations (e.g., bar graph, pie graph . . . ) can be updated to reflect the same segment of data in a given period of time.
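The coordinated-update behavior can be sketched as a shared time selection broadcast to registered views. This is a minimal observer-style illustration under assumed names (`CoordinationComponent`, `Visualization` are hypothetical):

```python
class CoordinationComponent:
    """Holds the shared time selection; registered visualizations are
    notified so that all views render the same segment of profile data."""
    def __init__(self):
        self.selection = (0.0, 0.0)
        self.visualizations = []

    def register(self, viz):
        self.visualizations.append(viz)

    def select(self, start, end):
        """Zooming or selecting in any one view routes through here,
        so every active view is updated in concert."""
        self.selection = (start, end)
        for viz in self.visualizations:
            viz.update(start, end)

class Visualization:
    def __init__(self, name):
        self.name = name
        self.range = None

    def update(self, start, end):
        self.range = (start, end)  # re-render for the selected segment

coord = CoordinationComponent()
line, pie = Visualization("line"), Visualization("pie")
coord.register(line)
coord.register(pie)
coord.select(5.0, 8.0)  # zoom in one view; every view follows
```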
What follows is a series of exemplary screenshots to aid clarity and understanding with respect to functionality enabled by the visualization system 100. The screenshots are only one possible arrangement of graphical elements. Other types of graphics and arrangements thereof are possible. Accordingly, the screenshots are not intended to limit the scope of the appended claims but rather to provide possible representations to facilitate further description and understanding of the functionality afforded by the visualization system 100. Further, although the screenshots pertain to profiling with respect to a web browser and web applications executing within the web browser, the claimed subject matter is not limited thereto. Additionally, although discussion focuses on visualizations, other mechanisms can also be utilized, such as audio, to further enrich a user's experience in analyzing profile data.
Area 310 is designated for target control. In other words, a particular subset or superset of profile data can be selected for display by way of a selection signal (e.g., drag and drop, click, voice command . . . ). Here, three web browser processes are displayed as thumbnails, which are selectable. The larger size of the first web browser thumbnail relative to the others indicates that it was selected and thus profile data corresponds to that particular process. Besides being a utility for collecting real time information with respect to executing programs, this target control can be used to navigate/select from persisted logs. Additionally, target control is not limited to processes/other targets that are running on a local machine. Remote instances or web applications running on attached devices can also be rendered in area 310.
Area 320 illustrates a pool of available visualizations that can be deployed against profiling data. In accordance with one embodiment, visualizations can be dragged from area 320 into another area in order to render that visualization against current profiling data. In other words, a visualization can be applied based on a selection signal (e.g., drag and drop, click, voice command . . . ) identifying the visualization. As will be described further hereinafter, it is possible to organize and overlay visualizations in many ways.
Area 330 is the holder area for visualizations that comprise non-timeline-based visualizations of profile data. That is, the visualizations include rendering that is not overtly associated with a timeline. Here, pie chart 332 shows a percentage breakdown of time spent between several sub-systems during the last frame rendering of an application. By way of example, the pie chart 332 can denote central processing unit utilization for an executing web application running in a browser in which the white portion indicates idle time and the black portion indicates actual processor utilization. Further, a time-correlated screenshot image 334 is displayed to the left of the pie chart 332. The screenshot image 334 is a view of a target web application at a specific point in time. This view is useful in order to understand the state of the webpage and/or most recent user action associated with a specific sequence of profiling data. Note that this visualization could itself be overlaid onto the thumbnail images in target area 310. As well, in some embodiments, such as real time collection scenarios, this thumbnail could serve as an event surrogate for the target process. That is, the thumbnail could serve as a sink for events/other messages, which are subsequently forwarded to the actual target application. For example, user input, such as mouse moves, clicks, and text entry, can be overlaid on the screenshot image 334.
Area 340 shows a holder area for timeline-based visualizations. This area can include an arbitrary number of “swim lanes” that permit side-by-side display. Here, two visualizations are illustrated, namely line graph 342 and line and bar graph 344. The line and bar graph 344 illustrates the ability to overlay visualizations on top of each other. Further, the line graph 342 and line and bar graph 344 depict rendering of visualizations according to a common notion of scope or selected time, among other things. Other visualizations such as the pie chart 332 and the screenshot image 334 can also synchronize their renderings to such scope or selected time. A user can select (via a selection signal) a contiguous sequence of time that is of interest with respect to visualizations in area 340. In this case, region 346 has been selected. During or after this selection process, other visualizations can be updated according to the profile data that is selected. For example, visualizations in area 330 can alter their appearance in response to the selection.
A set of controls 350 are shown that facilitate recording and playback of data. In addition to features such as start recording, pause playback, and jump to next, among others, functionality can be implementation specific. Jump to next, for example, can be associated with several meaningful sequence points such as the next document object model (DOM) event, the next code generated sequence point, the next rendered frame, or the next failed diagnostic message, among others. Slider 360 is a control that allows control of speed of playback. In other embodiments, a similar slider can be used to control zoom in and zoom out of profiling data and/or control of the size of a recording buffer for profiling data (e.g., record last twenty-five seconds of profiling data).
Area 370 is a message display area for rendering analysis/results that are best displayed in a scrollable form, such as a list of diagnostic messages and/or a diagnostic synopsis of profile data by function name, among others. These entries can be synchronized with selection of other visualizations and can themselves be used as selection mechanisms. For instance, if a user selects a time sequence on a timeline, the diagnostic messages displayed in area 370 can be constrained to the selected timeframe. Conversely, a user might double-click a diagnostic message displayed in area 370, in which case the selection can be updated to the point in time at which that error was raised. All other visualizations could be updated to correspond to that point in time as well.
Visualization 420 shows a central processing unit utilization visualization generated by logical grouping. This view might make it immediately apparent, for example, that an inordinate period of time is spent rendering for a specific user-interface update frame.
Visualization 430 shows a nested view of execution. Here, a hierarchy of logical sub-systems is displayed. A general request to re-render a page, for instance, might result in a series of nested executions, for example to process CSS rules, to lay out elements in a page, or to render each constituent element.
Visualization 440 distills a semantic operation into an easy-to-understand visualization that is correlated to other selection controls/visualizations. Here, time spent rendering, or painting, a page is shown such that the period and length of time of such an operation are denoted by the position and size of polygons with respect to a timeline.
Visualization 450 presents information that is best understood as text, content rendered as a scrollable list, or another form, but which remains correlated to other views/profile range selectors. In this case, visualization 450 is a profile report (e.g., a call count and report of time spent executing various functions) presented for the current range of profile data.
Visualization 920 elaborates on visualization 910 by demonstrating an alteration in visual appearance (e.g., diagonal hatching, change in color, flashing . . . ) to indicate that some diagnostic standard, or performance goal, has not been met. For example, perhaps the frame rate has fallen below an acceptable threshold due to an inordinate period of time being spent in the display subsystem. In this way, the visualization provides a useful ad hoc diagnostic mechanism for users collecting information in real time or playing back a collected log. A user following up on a report of poor web page performance, for example, might connect this visualization to a running instance of the page and begin to interact with it. On making a gesture, such as hovering over a specific element of a program that compromises frame rate throughput, the visualization can provide clear feedback that performance is compromised.
Note also that a visualization can be used to annotate a general event stream with events or notifications. That is, a visualization is not only a source of interesting animations/other user interface data, but a visualization itself could be responsible for processing/analyzing profiling data in order to provide diagnostics information, or a summary of operation, among other things. A broader system includes a general mechanism for folding per-visualization events/other data into the general event stream, global timeline controllers, common error reporting areas, etc. Data associated with a specific visualization can be identifiable. This permits, for example, a user to “drag off” a specific visualization, with the result that all its events, diagnostics messages, and other artifacts associated with it would also go away.
Further, although not illustrated, the visualization system 100 can support a differencing mechanism. By way of example, consider two log files including historical data. Here, two screen images can be presented corresponding to the two log files with correlated profile data. Further, the profile data can be marked with identical event markers (e.g., identify a page startup, identify a set of common user interface operations . . . ) to facilitate comparison.
Turning attention to
The data collection component 1210 is configured to acquire profile data regarding program 1212. Profile data can be any arbitrary collected and/or computed data associated with execution of a program. In one instance, such profile data can correspond to a time to execute with respect to a particular function/operation. In another instance, the profile data can correspond to non-time-to-execute data including occurrence or frequency of one or more events and optionally a payload associated with the occurrence of one or more events, where an event is a signal that an executing program has hit a specific point in code and a payload is data associated with hitting the specific point in code. By way of example, non-time-to-execute profile data can include bytes or objects allocated, page refreshes, registry reads/writes, worker threads spawned, or associated binary, among other things. The program (a.k.a. computer program) 1212 comprises a set of instructions specified in a computer programming language that when executed by a processor performs actions prescribed by the set of instructions. The data collection component 1210 can gather profile data from an executing program utilizing a variety of techniques including instrumentation and sampling. Work performed by the program 1212 can be captured by event/execution probes (e.g., event tracing events or callbacks from instrumented code) or stack samples collected by an operating system, for example. Data can be collected that enables analysis in terms of time-to-execute (e.g., central processing unit utilization or literal cycles spent in active execution), heap allocations, or other arbitrary information that can be expressed in custom payloads of generated events.
A combination of instrumentation and sampling can be utilized by the data collection component 1210. Data collected by each approach can augment the other approach. For example, data can be combined/merged, averaged, crosschecked, and/or statistically normalized, among other things. In this manner, advantages of both approaches can be exploited and disadvantages can be offset. As an example, both approaches can result in an observer effect, which means the act of observing a program itself can influence the program. Sampling typically does not result in a significant observer effect, but results in less precise data than that gathered by instrumentation. Accordingly, sampling data can be supplemented with instrumentation data, where appropriate. Additional mitigation against the observer effect can also be provided by creating profiling buckets that overtly group (and therefore exclude from other data) profiling data associated with operating system code that executes during production of events from instrumented code and stacks from code sampling. Further, a lightweight instrumentation approach can also be enabled where a stack is received with respect to instrumented probes/events. Suppose a single event is enabled solely for bytes allocated (which results in very little observer effect). If the “bytes allocated” event is associated with a stack, the bytes allocated data can be grouped according to stack-specified bucketing. Note, in this example, there is no sampling literally enabled. Hence, run-time events and/or collected stacks can be utilized to organize profile data.
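The combining/merging of data from the two approaches can be sketched as follows. This is one hypothetical blending strategy (a weighted average where both approaches observed a function, falling back to whichever approach has data); it is illustrative only, not the prescribed merging method:

```python
def merge_profiles(sampled, instrumented, weight=0.5):
    """Blend per-function time estimates from sampling (low observer
    effect but approximate) and instrumentation (precise but intrusive).
    Functions seen by only one approach keep that approach's figure."""
    merged = {}
    for func in set(sampled) | set(instrumented):
        if func in sampled and func in instrumented:
            # Both approaches observed this function: average the figures.
            merged[func] = (weight * sampled[func]
                            + (1 - weight) * instrumented[func])
        else:
            # Supplement one approach's data with the other's.
            merged[func] = sampled.get(func, instrumented.get(func))
    return merged

sampled = {"render": 10.0, "parse": 4.0}
instrumented = {"render": 12.0, "gc": 1.0}
merged = merge_profiles(sampled, instrumented)
```

Crosschecking or statistical normalization, also mentioned above, would follow the same shape with a different combining function.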
Group component 1220 is configured to ascribe profile data gathered utilizing the data collection component 1210 to specific groups, or buckets, as a function of organizational scheme 1214. In other words, profile data can be correlated and associated based on a descriptive mechanism that defines groups and relationships between groups. In one instance, groups can convey information about high-level functions of a program or sub-systems thereof (e.g., opening a document, recalculating layout, rendering a window . . . ). In furtherance of the foregoing, the group component 1220 can be configured to initialize data structures based on a given organizational scheme 1214 and populate groups with profile data. Other implementations are also possible including, but not limited to, tagging profiling data with group information. In any event, the result of processing performed by the group component 1220 is grouped data, which can be housed in a local or remotely accessible data store 1240.
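The bucketing performed by the group component can be sketched as follows. The scheme contents here are hypothetical (a mapping from function names to browser sub-system groups, echoing the examples above); a real organizational scheme could also define hierarchy and priorities:

```python
def group_profile_data(records, scheme):
    """Ascribe each profile record to a group, or bucket, as a function
    of an organizational scheme mapping function names to high-level
    semantic groups. Unmatched functions fall into a catch-all bucket."""
    buckets = {}
    for func, value in records:
        group = scheme.get(func, "ungrouped")
        buckets.setdefault(group, []).append(value)
    return buckets

# Hypothetical scheme: functions grouped by browser sub-system.
scheme = {"parse_css": "Layout", "measure_box": "Layout",
          "paint_rect": "Rendering"}
records = [("parse_css", 2.0), ("paint_rect", 5.0), ("measure_box", 1.0)]
buckets = group_profile_data(records, scheme)
```

The resulting grouped data, keyed by semantic group rather than raw function, is what would be housed in the data store 1240 for querying and visualization.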
The organizational scheme 1214 can define groupings in terms of function names (e.g., full function name within a binary file (e.g., executable)), for example. Groups can be expressed in a hierarchical relationship of parents and children. However, groups can be constructed that are mutually exclusive of one another, for example by defining them as sibling nodes. Groups can include sub-groups, also called categories, as child nodes, for example. Note, however, that unless otherwise explicitly noted, use of the term “group” or “groups” is intended to include sub-groups, or categories. Functions can be associated with groups at any level and can appear in an arbitrary number of groups. In other words, functions are not limited to being ascribed to a single group but instead can appear in multiple groups at any level of granularity. As well, function information can be used to aggregate data, for example based on stack samples and/or an instrumented/event-based collection. For example, a binary file name and the function name can be utilized as bases for grouping. This can be helpful with respect to distinguishing functions that span multiple binary files. Groups can also define events that do not explicitly provide binary/function details, but that provide an event identifier, which is available in the collected data. Binary/function information in this case can be implied by the set of code locations that raise a specified event. In at least some operating systems, a stack will be available on generating an event to distinguish amongst multiple code locations raising the event. Further, note that a group hierarchy as provided herein can be independent of function execution paths. The reason for this is twofold. First, in the instrumented case, a notion of a stack can be employed for grouping data that is entirely decoupled from the actual execution stack. Second, in the sampling case, arbitrary unique stacks can be associated with the same group.
Priorities can also be designated for individual groups to assist in grouping data. More specifically, priority enables breaking of a default rule of organization based on the most current stack frame (e.g., literal code stack, virtual event start/stop pair stack). Such a priority value can be expressed relative to all other groups and used to assist in grouping/bucketing decisions. If data is under consideration for a group that has a lower specified priority than other candidate groups that exist to hold the data, the data can be attributed to an alternate group that has the highest explicit or implied priority. Groups defined in hierarchical relationships can have an implicit priority based on that hierarchy. If a parent “A” includes a child “B,” for example, “B” is regarded as a higher priority group over “A.” Accordingly, data collected with a stack “A::B” will therefore be associated with group “B.” However, if group “B” is explicitly specified at a lower priority than group “A,” then data will be grouped in group “A.”
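The priority rules above, including the “A::B” example, can be sketched as follows. This is an illustrative resolution function under assumed representations (stacks as outermost-first lists, an explicit priority map overriding the implicit hierarchy priority):

```python
def resolve_group(stack, groups, priorities=None):
    """Pick the group for a sample given its stack (outermost frame
    first). By default the most current (deepest) frame's group wins,
    mirroring the implicit rule that a child outranks its parent;
    explicit priorities override that default."""
    priorities = priorities or {}
    candidates = [groups[f] for f in stack if f in groups]
    if not candidates:
        return None
    # Implicit priority: later (deeper) candidates rank higher.
    implicit = {g: i for i, g in enumerate(candidates)}
    return max(candidates, key=lambda g: priorities.get(g, implicit[g]))

groups = {"funcA": "A", "funcB": "B"}
# Stack A::B -- child group "B" wins under the implicit hierarchy priority.
default_pick = resolve_group(["funcA", "funcB"], groups)
# With "B" explicitly specified at lower priority than "A",
# the data is instead grouped in "A".
demoted_pick = resolve_group(["funcA", "funcB"], groups,
                             priorities={"B": -1, "A": 0})
```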
Profile data can be correlated in various ways to permit flexible querying at various scopes, among other things. As examples, profiling data might extend across process boundaries or across machines. In one instance, profile data can be grouped by time in accordance with timestamps. At a more granular level, the profile data can be grouped by central processing units or threads. Further, a universal activity identifier could be used to track inter-thread communication in useful ways, for instance by tracking activity associated with a multi-threaded transaction.
In one instance, the organizational scheme 1214 can be embodied as a data file that can be authored, edited, and maintained out-of-band from, or independent of, other processes. Further, the organizational scheme 1214 can be easily passed around. This can allow code experts to capture a useful model of analyzing profile data for a program of which they are an expert and distribute this model to non-expert users. For example, an expert in the inner workings of a web browser could generate an organizational scheme and distribute the scheme to web application developers to aid performance tuning.
In one particular implementation, the organizational scheme 1214 can be expressed in XML (eXtensible Markup Language), thus providing general readability. In this implementation, and as employed with further description below, groups or subgroups are referred to as tags or tagsets, where a tag denotes a group and a tagset refers to a set of groups. Examples of organizational schemes specified in XML are provided later herein.
Scheme generation component 1230 is configured to facilitate generation of the organizational scheme 1214. In one instance, a human user can manually author the organizational scheme 1214, optionally employing pattern matching or other filtering mechanisms (e.g., regular expressions, use of wild-card characters . . . ) to aid specification. The scheme generation component 1230 can enable automatic or semi-automatic (e.g., with user assistance) generation of groups, for instance based on available information including context information. By way of example, and not limitation, source control history, which stores information regarding changes made to a program, can be mined by the scheme generation component 1230 and used to automatically generate groups. For instance, a “code owner” group could be created to enable profile data to be broken down by team or individual owner of code. This allows users to easily identify experts who could assist with a performance problem.
Query processor component 1250 is configured to enable execution of queries over grouped data. Given a query, the query processor component 1250, utilizing known or novel mechanisms, can extract and return results that satisfy the query from the data store 1240. A visualization, or diagnostic tool, for example, can employ the query processor component 1250 to acquire data.
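A minimal sketch of such query execution over grouped records follows. The record shape ("group", "metric", "value") and the predicate-based query form are assumptions for illustration.

```python
# Sketch: a query processor that extracts and returns records satisfying a
# query from a store of grouped profile data. The record shape is an
# illustrative assumption.
def query(records, predicate):
    """Return records satisfying the predicate, e.g. for a visualization tool."""
    return [r for r in records if predicate(r)]

store = [
    {"group": "A", "metric": "time_ms", "value": 120},
    {"group": "B", "metric": "time_ms", "value": 30},
]
group_a = query(store, lambda r: r["group"] == "A")
```

A visualization or diagnostic tool would supply the predicate and render the filtered results.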
The visualization system 100 is configured to enable grouped profile data to be visualized. The profile data can be rendered in an arbitrary number of ways rather than being limited to conventional call trees. Further, since profile data is grouped at a logically meaningful level, visualizations can exploit groupings and further aid a human user in understanding profile data. In accordance with one embodiment, the visualization system 100 can spawn a graphical user interface that graphically depicts grouped profile data and enables interaction with the data. For example, a query can be authored over the grouped data by the visualization system 100, or a human user, and filtered results returned as a function of the query. Further, a mechanism can be provided to allow users to drill down to acquire more detailed data as well as rollup to view data at a more abstract/high-level.
While the program profile system 1200 includes the visualization system 100, it should be appreciated that visualization or diagnostic tools can be external to the program profile system 1200. In this scenario, interaction can be accomplished in a similar manner, for example by querying the profile system for requisite data. Additionally or alternatively, the program profile system 1200 can push profile data to the visualization system 100 as events, where the visualization system is a subscriber to such an event stream. Furthermore, note that the visualization system 100 can be extended in various ways. For example, initially the visualization system 100 can support a first set of visualizations. Subsequently, a second set of visualizations can be added by way of a third-party plugin or through an update/upgrade to the program profile system 1200, for instance.
Subscription component 1270 provides an additional or alternate manner of disseminating grouped data. More specifically, grouped data housed in data store 1240 can be delivered in accordance with a publish/subscribe model. The subscription component 1270 can offer and manage subscriptions and publish the grouped data to interested subscribers.
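The publish/subscribe delivery can be sketched minimally as follows; the class and method names are illustrative assumptions, not taken from the disclosure.

```python
# Sketch: a subscription component that manages subscribers and publishes
# grouped data to them. Names are illustrative assumptions.
class SubscriptionComponent:
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        """Register an interested subscriber."""
        self._subscribers.append(callback)

    def publish(self, grouped_data):
        """Deliver grouped data to every subscriber."""
        for callback in self._subscribers:
            callback(grouped_data)

received = []
bus = SubscriptionComponent()
bus.subscribe(received.append)
bus.publish({"group": "A", "time_ms": 42})
```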
The program profile system 1200 is not limited to operating with respect to real-time execution of a program, or, stated differently, as profile data is collected. Additionally, the program profile system 1200 can operate with respect to historical data, or at a time post-collection. For example, an arbitrary organizational scheme can be overlaid on a persisted log to enable data to be viewed at a meaningful level in terms of program function and structure. In other words, an arbitrary number of views can be raised against trace data by applying alternate groupings.
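Overlaying an arbitrary scheme on a persisted log can be sketched as follows; the log record shape and the sample scheme mapping functions to logical areas are assumptions for illustration.

```python
# Sketch: overlaying an arbitrary organizational scheme on an
# already-persisted sample log, post-collection. The same trace can be
# re-viewed under alternate groupings. Record fields are illustrative.
def regroup(log, grouping):
    """Aggregate a persisted sample log under a caller-supplied grouping."""
    totals = {}
    for sample in log:
        group = grouping(sample)
        totals[group] = totals.get(group, 0) + sample["cost"]
    return totals

log = [
    {"function": "parse", "cost": 5},
    {"function": "draw", "cost": 7},
    {"function": "lex", "cost": 3},
]

# A hypothetical scheme mapping functions to logically meaningful areas.
scheme_v1 = {"parse": "Frontend", "lex": "Frontend", "draw": "Graphics"}
by_area = regroup(log, lambda s: scheme_v1[s["function"]])
```

A different scheme applied to the same log yields a different view, without re-running the program.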
As shown in
Still further yet, rather than encoding the organizational scheme 1214 in the program 1212, a process for generating the organizational scheme 1214 can be encoded in the program 1212. The scheme generation component 1230 can then automatically generate the organizational scheme 1214 based on the encoded process, for instance at runtime. As an example, at runtime, an arbitrary grouping can be created by binary name, such that all profile data collected during execution of a first binary is ascribed to the first binary and all profile data collected when a second binary is executed is ascribed to the second binary. Additionally, the scheme generation component 1230 can be configured to automatically update the organizational scheme 1214 in real-time, for example as data is being collected or as part of dynamically regenerating a new view on a collected log.
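The runtime grouping-by-binary example can be sketched as follows; the sample record shape is an assumption for illustration.

```python
# Sketch: generating a grouping by binary name at runtime, such that all
# profile data collected during execution of a binary is ascribed to that
# binary. The sample fields are illustrative assumptions.
def group_by_binary(samples):
    """Ascribe each sample's cost to the binary executing when it was taken."""
    totals = {}
    for s in samples:
        totals[s["binary"]] = totals.get(s["binary"], 0) + s["cost"]
    return totals

samples = [
    {"binary": "first.dll", "cost": 4},
    {"binary": "second.dll", "cost": 6},
    {"binary": "first.dll", "cost": 1},
]
```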
Combination component 1430 is configured to enable use of both instrumentation and sampling in various manners. For instance, data collected by each approach can augment the other approach. More specifically, data can be combined, utilized to cross check results, and/or statistically normalized, among other things. In this manner, advantages of both approaches can be exploited and disadvantages can be mitigated. By way of example and not limitation, inherently less precise sample data can be supplemented with more precise instrumentation data.
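One illustrative combination strategy, sketched below, scales approximate per-group sample counts so they sum to a precise instrumented total; this particular normalization is an assumption for illustration, not a strategy fixed by the disclosure.

```python
# Sketch: statistically normalizing approximate sample counts against a
# precise instrumented total duration. The strategy shown is one
# illustrative assumption among the combinations described.
def normalize_samples(sample_counts, instrumented_total_ms):
    """Scale per-group sample counts so they sum to the measured total."""
    total = sum(sample_counts.values())
    return {g: c / total * instrumented_total_ms
            for g, c in sample_counts.items()}

# 20% of samples fell in group "A"; instrumentation measured 50 ms total.
estimate = normalize_samples({"A": 20, "B": 80}, instrumented_total_ms=50.0)
```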
Context detection component 1440 is configured to detect a context switch, or in other words, a change in processing context. For example, a context switch can occur with regard to multiple processes sharing a CPU. In such a multitasking environment, a process can be halted and the state of a CPU stored such that the process can be resumed from the same point at a later time. Stated differently, one process is switched out of a central processing unit so that another process can run. Once a context switch is detected with respect to a program being profiled, data collection can be suspended until processing resumes, or collected data can be marked such that the program profile system 1200 can differentiate data associated with a program from data that is not. In this manner, data can be excluded. For example, duration of function calls, or time-to-execute, can exclude time spent executing processes unrelated to a program being profiled, or in other words, periods of inactivity with respect to a program being profiled due to a context switch.
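The exclusion of switched-out time from a call's duration can be sketched as follows; the interval representation is an assumption for illustration.

```python
# Sketch: excluding context-switched-out intervals from a function call's
# wall-clock duration, so time-to-execute reflects only the profiled
# program. Interval representation is an illustrative assumption.
def active_duration(start, stop, switched_out):
    """Subtract (out, in) intervals, derived from context-switch events,
    that overlap the call's [start, stop) window."""
    inactive = sum(min(stop, t_in) - max(start, t_out)
                   for t_out, t_in in switched_out
                   if t_in > start and t_out < stop)
    return (stop - start) - inactive

# Call ran from t=0 to t=100, but the process was switched out for 10 + 15.
duration = active_duration(0, 100, [(20, 30), (60, 75)])
```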
Consider the following organizational scheme authored in XML in conjunction with
This organizational scheme concerns an instrumented approach to data collection including a plurality of start/stop event pairs for three arbitrary groups (a.k.a. tags) “A,” “B,” and “C.”
The following is a similar scenario except that sampling is the data collection mechanism rather than instrumentation. Consider the following organizational scheme specified in XML in conjunction with
Here, the organizational scheme is defined for groupings of stack samples. As shown in
Note also that an organizational scheme, such as the previous organizational scheme, can employ wild cards and/or pattern matching with respect to symbol/event names. Consider, for example, use of a star wildcard character as in “module=‘*’ method=‘malloc’”. Here, the allocation routine “malloc” is identified across all modules. As another example, consider “method=‘ClassOne::*’”. In this case, all members of “ClassOne” are identified.
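The wildcard matching can be sketched with shell-style patterns; the "module!method" symbol form is an assumption for illustration, while the patterns mirror the examples in the text.

```python
# Sketch: wildcard matching of symbol names against module/method patterns,
# mirroring the "module='*' method='malloc'" and "ClassOne::*" examples.
# The "module!method" symbol format is an illustrative assumption.
from fnmatch import fnmatch

def matches(symbol, module_pattern, method_pattern):
    """True if the symbol's module and method both match their patterns."""
    module, method = symbol.split("!", 1)
    return fnmatch(module, module_pattern) and fnmatch(method, method_pattern)

any_malloc = matches("msvcrt!malloc", "*", "malloc")         # any module's malloc
member = matches("app!ClassOne::Render", "app", "ClassOne::*")  # any ClassOne member
```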
Returning to
As per the sampling approach, rather than having event start/stop pairs, a sampled stack and timestamp are employed. Groups can be defined with respect to one or more particular stacks. In other words, a subset of stacks is mapped to one or more logical groups. In the instrumentation approach, it is known when an event such as “C” fires. However, there is also a stack corresponding to event “C” firing. Additionally, event “C” may be fired from multiple places, so there can be many different stacks that can be attributed to group “A.” For example, it can be indicated that stack “A::B::C” and stack “X::Y::C” are indicative of event “C” and map to group “A,” but stack “D::E::C” does not. In other words, empirical code conditions are detected that indicate that event “C” might have fired, and as such associated profile data should be ascribed to logical group “A.” With instrumentation, it is possible to determine precisely what is happening in a program, whereas sampling is an approximation. Sampling essentially indicates that a certain percentage of the time when the program was checked, profiling data was ascribed to logical group “A.”
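The stack-to-group mapping can be sketched as follows, using the stacks named in the text; the counting of unmapped stacks is an illustrative assumption.

```python
# Sketch: attributing sampled stacks to a logical group. Per the text,
# stacks "A::B::C" and "X::Y::C" indicate event "C" and map to group "A",
# while "D::E::C" does not. Handling of unmapped stacks is an assumption.
STACK_TO_GROUP = {
    "A::B::C": "A",
    "X::Y::C": "A",
}

def group_samples(stacks):
    """Count samples per logical group; unmapped stacks are left ungrouped."""
    counts = {}
    for stack in stacks:
        group = STACK_TO_GROUP.get(stack, "<ungrouped>")
        counts[group] = counts.get(group, 0) + 1
    return counts

counts = group_samples(["A::B::C", "X::Y::C", "D::E::C", "A::B::C"])
```

The resulting counts are the statistical approximation sampling provides: the fraction of checks at which execution was attributable to group "A".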
When working with event start/stop pairs, these pairs lead to logical groupings and allow ascription of associated profile data and aggregation of intervening data points without consulting a stack. For example, when an event such as rendering starts, bytes are allocated, and subsequently the rendering event stops, the bytes allocated are ascribed to group “A,” for example. There is no need to consult a stack. However, if event start/stop pairs are not employed, there can be a stack associated with bytes allocated. In this case, it can be determined or inferred that the bytes allocated are attributable to group “A.” Therefore, stacks in combination with additional data allow a system to work back to the same event start/stop pair grouping. In another scenario, where solely sample stacks are employed, time-to-execute sampling can be employed. Here, the information able to be revealed is the approximate time spent executing in a logical group or the like. In other words, the samples themselves are logically grouped.
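Ascribing intervening data points that are bracketed by start/stop events, without consulting a stack, can be sketched as follows; the event stream shape is an assumption for illustration.

```python
# Sketch: ascribing intervening data points (e.g., bytes allocated) to the
# group whose start/stop events bracket them, with no stack consulted.
# The event record shape is an illustrative assumption.
def ascribe(events):
    """Walk an interleaved event stream; attribute 'alloc' events to the
    currently open group, if any."""
    totals, open_group = {}, None
    for e in events:
        if e["kind"] == "start":
            open_group = e["group"]
        elif e["kind"] == "stop":
            open_group = None
        elif e["kind"] == "alloc" and open_group is not None:
            totals[open_group] = totals.get(open_group, 0) + e["bytes"]
    return totals

stream = [
    {"kind": "start", "group": "A"},
    {"kind": "alloc", "bytes": 512},
    {"kind": "stop", "group": "A"},
    {"kind": "alloc", "bytes": 128},  # outside any group; not ascribed
]
```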
It is to be appreciated that grouping information need not be limited to that which is expressed in an organizational scheme to organize profile data. By way of example, consider a data stream that interleaves grouping start/stop event pairs with another event of interest such as bytes allocated. In this case, the grouping start/stop event pairs can be utilized as a basis for grouping the bytes allocated event. In other words, bytes allocated can be ascribed to a group associated with the event start/stop pairs. Of course, stack-grouping information expressed in an organizational scheme can be utilized and applied to a callback associated with an event payload, for instance.
The aforementioned systems, architectures, environments, and the like have been described with respect to interaction between several components. It should be appreciated that such systems and components can include those components or sub-components specified therein, some of the specified components or sub-components, and/or additional components. Sub-components could also be implemented as components communicatively coupled to other components rather than included within parent components. Further yet, one or more components and/or sub-components may be combined into a single component to provide aggregate functionality. Communication between systems, components and/or sub-components can be accomplished in accordance with a push and/or pull model. The components may also interact with one or more other components not specifically described herein for the sake of brevity, but known by those of skill in the art.
Furthermore, various portions of the disclosed systems above and methods below can include or employ artificial intelligence, machine learning, or knowledge or rule-based components, sub-components, processes, means, methodologies, or mechanisms (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines, classifiers . . . ). Such components, inter alia, can automate certain mechanisms or processes performed thereby to make portions of the systems and methods more adaptive as well as efficient and intelligent. By way of example, and not limitation, the visualization system 100 can utilize such mechanisms to infer visualizations based on historical and contextual information.
In view of the exemplary systems described supra, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flow charts of
Referring to
The word “exemplary” or various forms thereof are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Furthermore, examples are provided solely for purposes of clarity and understanding and are not meant to limit or restrict the claimed subject matter or relevant portions of this disclosure in any manner. It is to be appreciated a myriad of additional or alternate examples of varying scope could have been presented, but have been omitted for purposes of brevity.
As used herein, the terms “component,” and “system,” as well as various forms thereof (e.g., components, systems, sub-systems . . . ) are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an instance, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
The conjunction “or” as used in this description and the appended claims is intended to mean an inclusive “or” rather than an exclusive “or,” unless otherwise specified or clear from context. In other words, “‘X’ or ‘Y’” is intended to mean any inclusive permutations of “X” and “Y.” For example, if “‘A’ employs ‘X,’” “‘A’ employs ‘Y,’” or “‘A’ employs both ‘X’ and ‘Y,’” then “‘A’ employs ‘X’ or ‘Y’” is satisfied under any of the foregoing instances.
As used herein, the term “inference” or “infer” refers generally to the process of reasoning about or inferring states of the system, environment, and/or user from a set of observations as captured via events and/or data. Inference can be employed to identify a specific context or action, or can generate a probability distribution over states, for example. The inference can be probabilistic—that is, the computation of a probability distribution over states of interest based on a consideration of data and events. Inference can also refer to techniques employed for composing higher-level events from a set of events and/or data. Such inference results in the construction of new events or actions from a set of observed events and/or stored event data, whether or not the events are correlated in close temporal proximity, and whether the events and data come from one or several event and data sources. Various classification schemes and/or systems (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines . . . ) can be employed in connection with performing automatic and/or inferred action in connection with the claimed subject matter.
Furthermore, to the extent that the terms “includes,” “contains,” “has,” “having” or variations in form thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
In order to provide a context for the claimed subject matter,
While the above disclosed system and methods can be described in the general context of computer-executable instructions of a program that runs on one or more computers, those skilled in the art will recognize that aspects can also be implemented in combination with other program modules or the like. Generally, program modules include routines, programs, components, data structures, among other things that perform particular tasks and/or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the above systems and methods can be practiced with various computer system configurations, including single-processor, multi-processor or multi-core processor computer systems, mini-computing devices, mainframe computers, as well as personal computers, hand-held computing devices (e.g., personal digital assistant (PDA), phone, watch . . . ), microprocessor-based or programmable consumer or industrial electronics, and the like. Aspects can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. However, some, if not all aspects of the claimed subject matter can be practiced on stand-alone computers. In a distributed computing environment, program modules may be located in one or both of local and remote memory storage devices.
With reference to
The processor(s) 2020 can be implemented with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine. The processor(s) 2020 may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, multi-core processors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The computer 2010 can include or otherwise interact with a variety of computer-readable media to facilitate control of the computer 2010 to implement one or more aspects of the claimed subject matter. The computer-readable media can be any available media that can be accessed by the computer 2010 and includes volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, memory devices (e.g., random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) . . . ), magnetic storage devices (e.g., hard disk, floppy disk, cassettes, tape . . . ), optical disks (e.g., compact disk (CD), digital versatile disk (DVD) . . . ), and solid state devices (e.g., solid state drive (SSD), flash memory drive (e.g., card, stick, key drive . . . ) . . . ), or any other medium which can be used to store the desired information and which can be accessed by the computer 2010.
Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
Memory 2030 and mass storage 2050 are examples of computer-readable storage media. Depending on the exact configuration and type of computing device, memory 2030 may be volatile (e.g., RAM), non-volatile (e.g., ROM, flash memory . . . ) or some combination of the two. By way of example, the basic input/output system (BIOS), including basic routines to transfer information between elements within the computer 2010, such as during start-up, can be stored in nonvolatile memory, while volatile memory can act as external cache memory to facilitate processing by the processor(s) 2020, among other things.
Mass storage 2050 includes removable/non-removable, volatile/non-volatile computer storage media for storage of large amounts of data relative to the memory 2030. For example, mass storage 2050 includes, but is not limited to, one or more devices such as a magnetic or optical disk drive, floppy disk drive, flash memory, solid-state drive, or memory stick.
Memory 2030 and mass storage 2050 can include, or have stored therein, operating system 2060, one or more applications 2062, one or more program modules 2064, and data 2066. The operating system 2060 acts to control and allocate resources of the computer 2010. Applications 2062 include one or both of system and application software and can exploit management of resources by the operating system 2060 through program modules 2064 and data 2066 stored in memory 2030 and/or mass storage 2050 to perform one or more actions. Accordingly, applications 2062 can turn a general-purpose computer 2010 into a specialized machine in accordance with the logic provided thereby.
All or portions of the claimed subject matter can be implemented using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to realize the disclosed functionality. By way of example and not limitation, the visualization system 100, or portions thereof, can be, or form part, of an application 2062, and include one or more modules 2064 and data 2066 stored in memory and/or mass storage 2050 whose functionality can be realized when executed by one or more processor(s) 2020.
In accordance with one particular embodiment, the processor(s) 2020 can correspond to a system on a chip (SOC) or like architecture including, or in other words integrating, both hardware and software on a single integrated circuit substrate. Here, the processor(s) 2020 can include one or more processors as well as memory at least similar to processor(s) 2020 and memory 2030, among other things. Conventional processors include a minimal amount of hardware and software and rely extensively on external hardware and software. By contrast, an SOC implementation of a processor is more powerful, as it embeds hardware and software therein that enable particular functionality with minimal or no reliance on external hardware and software. For example, the visualization system 100 and/or associated functionality can be embedded within hardware in a SOC architecture.
The computer 2010 also includes one or more interface components 2070 that are communicatively coupled to the system bus 2040 and facilitate interaction with the computer 2010. By way of example, the interface component 2070 can be a port (e.g., serial, parallel, PCMCIA, USB, FireWire . . . ) or an interface card (e.g., sound, video . . . ) or the like. In one example implementation, the interface component 2070 can be embodied as a user input/output interface to enable a user to enter commands and information into the computer 2010 through one or more input devices (e.g., pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, camera, other computer . . . ). In another example implementation, the interface component 2070 can be embodied as an output peripheral interface to supply output to displays (e.g., CRT, LCD, plasma . . . ), speakers, printers, and/or other computers, among other things. Still further yet, the interface component 2070 can be embodied as a network interface to enable communication with other computing devices (not shown), such as over a wired or wireless communications link.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the disclosed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims.