When writing code during the development of software applications, developers commonly spend a significant amount of time “debugging” the code to find runtime and other source code errors. In doing so, developers may take several approaches to reproduce and localize a source code bug, such as observing the behavior of a program based on different inputs, inserting debugging code (e.g., to print variable values, to track branches of execution, etc.), temporarily removing code portions, etc. Tracking down runtime errors to pinpoint code bugs can occupy a significant portion of application development time.
Many types of debugging applications (“debuggers”) have been developed in order to assist developers with the code debugging process. These tools offer developers the ability to trace, visualize, and alter the execution of computer code. For example, debuggers may visualize the execution of code instructions, may present code variable values at various times during code execution, may enable developers to alter code execution paths, and/or may enable developers to set “breakpoints” and/or “watchpoints” on code elements of interest (which, when reached during execution, cause execution of the code to be suspended), among other things.
An emerging form of debugging application enables “time travel,” “reverse,” or “historic” debugging. With “time travel” debugging, execution of a program (e.g., executable entities such as threads) is recorded/traced by a trace application into one or more trace data streams that record a “bit accurate” trace of that execution. These trace data stream(s) can then be used to replay execution of the program later, for both forward and backward analysis. For example, “time travel” debuggers can enable a developer to set forward breakpoints/watchpoints (like conventional debuggers) as well as reverse breakpoints/watchpoints.
At least some embodiments described herein leverage the wealth of information recorded in bit-accurate time travel traces to provide rich debugging experiences, including providing one or more visualizations of historical state associated with code element(s) that are part of a prior execution of an entity. For example, embodiments of a debugger could leverage time travel trace data to present last known and/or next known data values of one or more variables and/or data structures, to present last known and/or next known return values of one or more functions, etc. Visualizations of such data could include indicating that a value displayed for a code element at a particular point in execution is a current value, indicating that the value displayed for a code element at the particular point in execution is a next known value, indicating that the value displayed for a code element at the particular point in execution is a previously known value, presenting a history of values for a code element (e.g., in a timeline view, swimlane view, etc.), and the like.
Embodiments can include methods, systems, and computer program products for presenting historical state associated with a code element that is part of a prior execution of an entity. An example embodiment includes replaying one or more segments of the prior execution of the entity based on one or more trace data streams storing a trace of the prior execution of the entity. The example embodiment also includes, based on replaying the one or more segments of the prior execution of the entity, presenting historical state associated with the code element. Presenting historical state can include, in connection with a first execution time point in the prior execution of the entity, presenting at a user interface a first state of the code element, the first state of the code element being based on a first memory access associated with the code element at the first execution time point. Presenting historical state can also include, in connection with a different execution time point in the prior execution of the entity, presenting at the user interface the first state of the code element along with an indication that the first state of the code element is at least one of (i) a last known state based on the different execution time point being after the first execution time point, but prior to a second memory access associated with the code element at a second execution time point in the prior execution of the entity, or (ii) a next known state based on the different execution time point being prior to the first execution time point, but after a third memory access associated with the code element at a third execution time point in the prior execution of the entity.
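The distinction between a current, last known, and next known state can be illustrated with a minimal sketch. It assumes that replay has already produced a time-sorted list of memory accesses for one code element; the names Access, known_states, and history are hypothetical and are not taken from any described embodiment.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    time: int      # execution time point (assumed here to be an instruction count)
    value: object  # state of the code element observed by the memory access


def known_states(accesses, t):
    """Return the labeled state(s) of a code element at execution time point t.

    `accesses` must be sorted by time. The result maps a label
    ("current", "last known", "next known") to a value.
    """
    times = [a.time for a in accesses]
    i = bisect_right(times, t)  # number of accesses at or before t
    if i > 0 and times[i - 1] == t:
        return {"current": accesses[i - 1].value}
    result = {}
    if i > 0:
        result["last known"] = accesses[i - 1].value  # most recent earlier access
    if i < len(accesses):
        result["next known"] = accesses[i].value      # closest later access
    return result


# Hypothetical access history for a single variable.
history = [Access(10, 1), Access(20, 5), Access(30, 7)]
at_access = known_states(history, 20)
between = known_states(history, 25)
before_first = known_states(history, 5)
```

In this sketch, a time point that falls between two accesses yields both a last known and a next known value, and a user interface could choose which of the two to display and how to label it.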
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In order to describe the manner in which the above-recited and other advantages and features of the invention can be obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
The visualization embodiments described herein provide a richness of data not available in prior forms of debugging, and that can greatly enhance the ability of a debugger to present the operation of program code. By providing this richness of data, the visualization embodiments described herein provide improvements to the functioning of computers, and particularly those that are used for code debugging. For example, the visualization embodiments described herein enable computer systems to do things they could not do before—i.e., leveraging time travel trace data to present and interact with historical state associated with code element(s) that are part of a prior execution of an entity, in the various manners described herein. In doing so, these visualization embodiments also improve the efficiency of use of a computer system during debugging and can, therefore, dramatically decrease the amount of time it takes to debug code. Improving the efficiency of use of a computer system during debugging could also reduce an overall time spent by a developer using computing resources during the debugging process.
Embodiments within the scope of the present invention include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures. Such computer-readable media can be any available media that can be accessed by the computer system 101. Computer-readable media that store computer-executable instructions and/or data structures are computer storage devices. Computer-readable media that carry computer-executable instructions and/or data structures are transmission media. Thus, by way of example, and not limitation, embodiments of the invention can comprise at least two distinctly different kinds of computer-readable media: computer storage devices and transmission media.
Computer storage devices are physical hardware devices that store computer-executable instructions and/or data structures. Computer storage devices include various computer hardware, such as RAM, ROM, EEPROM, solid state drives (“SSDs”), flash memory, phase-change memory (“PCM”), optical disk storage, magnetic disk storage or other magnetic storage devices, or any other hardware device(s) which can be used to store program code in the form of computer-executable instructions or data structures, and which can be accessed and executed by the computer system 101 to implement the disclosed functionality of the invention. Thus, for example, computer storage devices may include the depicted system memory 103, the depicted data store 104 which can store computer-executable instructions and/or data structures, or other storage such as on-processor storage, as discussed later.
Transmission media can include a network and/or data links which can be used to carry program code in the form of computer-executable instructions or data structures, and which can be accessed by the computer system 101. A “network” is defined as one or more data links that enable the transport of electronic data between computer systems and/or modules and/or other electronic devices. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer system, the computer system may view the connection as transmission media. Combinations of the above should also be included within the scope of computer-readable media. For example, the input/output hardware 105 may comprise hardware (e.g., a network interface module (e.g., a “NIC”)) that connects a network and/or data link which can be used to carry program code in the form of computer-executable instructions or data structures.
Further, upon reaching various computer system components, program code in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to computer storage devices (or vice versa). For example, computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a NIC (e.g., input/output hardware 105), and then eventually transferred to the system memory 103 and/or to less volatile computer storage devices (e.g., data store 104) at the computer system 101. Thus, it should be understood that computer storage devices can be included in computer system components that also (or even primarily) utilize transmission media.
Computer-executable instructions comprise, for example, instructions and data which, when executed at the processor(s) 102, cause the computer system 101 to perform a certain function or group of functions. Computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code.
Those skilled in the art will appreciate that the invention may be practiced in network computing environments with many types of computer system configurations, including personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, tablets, pagers, routers, switches, and the like. The invention may also be practiced in distributed system environments where local and remote computer systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks. As such, in a distributed system environment, a computer system may include a plurality of constituent computer systems. In a distributed system environment, program modules may be located in both local and remote memory storage devices.
Those skilled in the art will also appreciate that the invention may be practiced in a cloud computing environment. Cloud computing environments may be distributed, although this is not required. When distributed, cloud computing environments may be distributed internationally within an organization and/or have components possessed across multiple organizations. In this description and the following claims, “cloud computing” is defined as a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services). The definition of “cloud computing” is not limited to any of the other numerous advantages that can be obtained from such a model when properly deployed.
A cloud computing model can be composed of various characteristics, such as on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service, and so forth. A cloud computing model may also come in the form of various service models such as, for example, Software as a Service (“SaaS”), Platform as a Service (“PaaS”), and Infrastructure as a Service (“IaaS”). The cloud computing model may also be deployed using different deployment models such as private cloud, community cloud, public cloud, hybrid cloud, and so forth.
Some embodiments, such as a cloud computing environment, may comprise a system that includes one or more hosts that are each capable of running one or more virtual machines. During operation, virtual machines emulate an operational computing system, supporting an operating system and perhaps one or more other applications as well. In some embodiments, each host includes a hypervisor that emulates virtual resources for the virtual machines using physical resources that are abstracted from view of the virtual machines. The hypervisor also provides proper isolation between the virtual machines. Thus, from the perspective of any given virtual machine, the hypervisor provides the illusion that the virtual machine is interfacing with a physical resource, even though the virtual machine only interfaces with the appearance (e.g., a virtual resource) of a physical resource. Examples of physical resources include processing capacity, memory, disk space, network bandwidth, media drives, and so forth.
Each processing unit 102a executes processor instructions that are defined by applications (e.g., tracer 104a, operating system kernel 104e, application 104f, etc.), and which instructions are selected from among a predefined processor instruction set architecture (ISA). The particular ISA of each processor 102 varies based on processor manufacturer and processor model. Common ISAs include the IA-64 and IA-32 architectures from INTEL, INC., the AMD64 architecture from ADVANCED MICRO DEVICES, INC., and various Advanced RISC Machine (“ARM”) architectures from ARM HOLDINGS, PLC, although a great number of other ISAs exist and can be used by the present invention. In general, an “instruction” is the smallest externally-visible (i.e., external to the processor) unit of code that is executable by a processor.
Each processing unit 102a obtains processor instructions from one or more processor cache(s) 102b and executes the processor instructions based on data in the cache(s) 102b, based on data in registers 102d, and/or without input data. In general, each cache 102b is a small amount (i.e., small relative to the typical amount of system memory 103) of random-access memory that stores on-processor copies of portions of a backing store, such as the system memory 103 and/or another cache in the cache(s) 102b. For example, when executing the application code 103a, one or more of the cache(s) 102b contain portions of the application runtime data 103b. If the processing unit(s) 102a request data not already stored in a particular cache 102b, then a “cache miss” occurs, and that data is fetched from the system memory 103 or another cache, potentially “evicting” some other data from that cache 102b. The cache(s) 102b may include code cache portions and data cache portions. When executing the application code 103a, the code portion(s) of the cache(s) 102b may store at least a portion of the processor instructions stored in the application code 103a and the data portion(s) of the cache(s) 102b may store at least a portion of data structures of the application runtime data 103b.
Each processor 102 also includes microcode 102c, which comprises control logic (i.e., executable instructions) that controls operation of the processor 102, and which generally functions as an interpreter between the hardware of the processor and the processor ISA exposed by the processor 102 to executing applications. The microcode 102c is typically embodied on on-processor storage, such as ROM, EEPROM, etc.
Registers 102d are hardware-based storage locations that are defined based on the ISA of the processor(s) 102 and that are read from and/or written to by processor instructions. For example, registers 102d are commonly used to store values fetched from the cache(s) 102b for use by instructions, to store the results of executing instructions, and/or to store status or state, such as some of the side-effects of executing instructions (e.g., the sign of a value changing, a value reaching zero, the occurrence of a carry, etc.), a processor cycle count, etc. Thus, some registers 102d may comprise “flags” that are used to signal some state change caused by executing processor instructions. In some embodiments, processors 102 may also include control registers, which are used to control different aspects of processor operation. Although
The data store 104 can store computer-executable instructions representing application programs such as, for example, a tracer 104a, an indexer 104b, a debugger 104c, an operating system kernel 104e, and an application 104f (e.g., the application that is the subject of tracing by the tracer 104a). When these programs are executing (e.g., using the processor(s) 102), the system memory 103 can store corresponding runtime data, such as runtime data structures, computer-executable instructions, etc. Thus,
The tracer 104a is usable to record a bit-accurate trace of execution of one or more entities, such as one or more threads of an application 104f or kernel 104e, and to store the trace data into the trace data store 104d. In some embodiments, the tracer 104a is a standalone application, while in other embodiments the tracer 104a is integrated into another software component, such as the kernel 104e, a hypervisor, a cloud fabric, etc. While the trace data store 104d is depicted as being part of the data store 104, the trace data store 104d may also be embodied, at least in part, in the system memory 103, in the cache(s) 102b, or at some other storage device.
In some embodiments, the tracer 104a records a bit-accurate trace of execution of one or more entities. As used herein, a “bit accurate” trace is a trace that includes sufficient data to enable code that was previously executed at one or more processing units 102a to be replayed, such that it executes in substantially the same manner at replay time as it did during tracing. There are a variety of approaches the tracer 104a might use to record bit-accurate traces. Two different families of approaches that provide high levels of performance and reasonable trace size are now briefly summarized, though it will be appreciated that the embodiments herein can operate in connection with traces recorded using other approaches. Additionally, optimizations could be applied to either of these families of approaches that, for brevity, are not described herein.
A first family of approaches is built upon the recognition that processor instructions (including virtual machine “virtual processor” instructions) generally fall into one of three categories: (1) instructions identified as “non-deterministic” because their outputs are not predictable, i.e., not fully determined by data in general registers 102d or the cache(s) 102b, (2) deterministic instructions whose inputs do not depend on memory values (e.g., they depend only on processor register values, or values defined in the code itself), and (3) deterministic instructions whose inputs depend on reading values from memory. Thus, in some embodiments, storing enough state data to reproduce the execution of instructions can be accomplished by addressing: (1) how to record non-deterministic instructions that produce output not fully determined by their inputs, (2) how to reproduce the values of input registers for instructions depending on registers, and (3) how to reproduce the values of input memory for instructions depending on memory reads.
In some embodiments, the first approach(es) for recording traces records non-deterministic instructions that produce output not fully determined by their inputs by storing into the trace data store 104d the side-effects of execution of such instructions. As used herein, “non-deterministic” instructions can include somewhat less common instructions that (i) produce non-deterministic output each time they are executed (e.g., RDTSC on INTEL processors, which writes the number of processor cycles since the last processor reset into a register), that (ii) may produce a deterministic output, but depend on inputs not tracked by the tracer 104a (e.g., debug registers, timers, etc.), and/or that (iii) produce processor-specific information (e.g., CPUID on INTEL processors, which writes processor-specific data into registers). Storing the side-effects of execution of such instructions may include, for example, storing register values and/or memory values that were changed by execution of the instruction. In some architectures, such as from INTEL, processor features such as those found in Virtual Machine eXtensions (VMX) could be used to trap instructions for recording their side effects into the trace data store 104d.
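As an illustration only, the following sketch shows the idea of recording the side-effect of a non-deterministic instruction and substituting it at replay. The two-instruction toy machine, the "read_counter" operation (standing in for an instruction such as RDTSC), and the dictionary-based trace are all assumptions made for this example.

```python
import time


def run(program, trace, replaying=False):
    """Execute a toy program; record or replay non-deterministic results.

    `program` is a list of (operation, destination register) pairs, and
    `trace` maps an instruction index to the register value it produced.
    """
    regs = {}
    for pc, (op, dst) in enumerate(program):
        if op == "read_counter":
            if replaying:
                regs[dst] = trace[pc]               # reuse the recorded side-effect
            else:
                regs[dst] = time.perf_counter_ns()  # non-deterministic output
                trace[pc] = regs[dst]               # record the changed register value
        elif op == "double":
            regs[dst] = regs[dst] * 2               # deterministic: nothing is recorded
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs


program = [("read_counter", "r0"), ("double", "r0")]
trace = {}
recorded_regs = run(program, trace)
replayed_regs = run(program, trace, replaying=True)
```

Only the counter read contributes an entry to the trace; the deterministic instruction is simply re-executed during replay.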
Addressing how to reproduce the values of input registers for deterministic instructions (e.g., whose inputs depend only on processor register values) is straightforward, as they are the outputs of the execution of the prior instruction(s). The first approach(es) for recording traces can therefore reduce recording the execution of an entire series of processor instructions into the trace data store 104d to reproducing the register values at the beginning of the series; the trace data in the trace data store 104d need not store a record of which particular instructions executed in the series, or the intermediary register values. This is because the actual instructions are available from the application code 103a, itself. These instructions can therefore be supplied the recorded inputs (i.e., the recorded initial set of register values) during replay, to execute in the same manner as they did during the trace.
Finally, the first approach(es) for recording traces can address how to reproduce the values of input memory for deterministic instructions whose inputs depend on memory values by recording into the trace data store 104d the memory values that these instructions consumed (i.e., the reads), irrespective of how the values that the instructions read were written to memory. In other words, some embodiments include recording only memory reads, but not memory writes. For example, although values may be written to memory by a current thread, by another thread (including the kernel, e.g., as part of processing an interrupt), or by a hardware device (e.g., input/output hardware 105), it is just the values that the thread's instructions read that are needed for full replay of instructions of the thread that performed the reads. This is because it is the values that were read by the thread (and not necessarily all the values that were written to memory) that dictated how the thread executed.
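The read-only recording idea can be sketched as follows. This is a simplified illustration, not a description of the tracer 104a: RecordingMemory, ReplayMemory, and traced_code are hypothetical names, and the "thread" is modeled as an ordinary deterministic function that starts from an initial set of register values.

```python
class RecordingMemory:
    """Wraps real memory and logs every value that is read."""

    def __init__(self, backing, log):
        self.backing = backing
        self.log = log

    def read(self, addr):
        value = self.backing[addr]
        self.log.append(value)  # only consumed values are recorded
        return value

    def write(self, addr, value):
        self.backing[addr] = value  # writes are performed but not logged


class ReplayMemory:
    """Supplies logged read values in order; needs no backing memory."""

    def __init__(self, log):
        self._values = iter(log)

    def read(self, addr):
        return next(self._values)

    def write(self, addr, value):
        pass  # writes have no effect at replay; later reads come from the log


def traced_code(regs, mem):
    """Deterministic code whose behavior depends on registers and reads."""
    total = regs["acc"]
    for addr in (0, 1, 2):
        total += mem.read(addr)
    mem.write(3, total)
    return total + mem.read(3)


log = []
memory = {0: 4, 1: 5, 2: 6, 3: 0}
initial_regs = {"acc": 1}
recorded = traced_code(dict(initial_regs), RecordingMemory(memory, log))
replayed = traced_code(dict(initial_regs), ReplayMemory(log))
```

Replay here needs only the initial register values and the ordered log of read values, regardless of which thread or device originally wrote those values.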
These first approach(es) for recording traces can be implemented fully in software (e.g., as part of a kernel or hypervisor, or as part of a fully-virtualized environment), or can be implemented with the assistance of hardware. For example, they could be implemented by modifications to the processor(s) 102 that assist in determining what to log and/or in actually writing trace data to a trace buffer (e.g., buffer 102e or a reserved portion of the cache(s) 102b) or file.
A second family of approaches for recording bit-accurate traces is built on the recognition that the processor 102 (including the cache(s) 102b) forms a semi- or quasi-closed system. For example, once portions of data for a process (i.e., code data and runtime application data) are loaded into the cache(s) 102b, the processor 102 can run by itself, without any input, as a semi- or quasi-closed system for bursts of time. In particular, once the cache(s) 102b are loaded with data, one or more of the processing units 102a execute instructions from the code portion(s) of the cache(s) 102b, using runtime data stored in the data portion(s) of the cache(s) 102b and using the registers 102d. When a processing unit 102a needs some influx of information (e.g., because an instruction it is executing, will execute, or may execute accesses code or runtime data not already in the cache(s) 102b), a “cache miss” occurs and that information is brought into the cache(s) 102b from the system memory 103. For example, if a data cache miss occurs when an executed instruction performs a memory operation at a memory address within the application runtime data 103b, data from that memory address is brought into one of the cache lines of the data portion of the cache(s) 102b. Similarly, if a code cache miss occurs when an instruction performs a memory operation at a memory address within the application code 103a stored in system memory 103, code from that memory address is brought into one of the cache lines of the code portion(s) of the cache(s) 102b. The processing unit 102a then continues execution using the new information in the cache(s) 102b until new information is again brought into the cache(s) 102b (e.g., due to another cache miss or an un-cached read).
Thus, in the second family of approaches, the tracer 104a can record sufficient data to be able to reproduce the influx of information into the cache(s) 102b as a traced processing unit executes. Five example implementations within this second family of approaches are now described, though it will be appreciated that these are not exhaustive. Since the second approach(es) for recording traces rely closely on operation of the cache(s) 102b, they are typically implemented with hardware assistance. For example, they could be implemented by modifications to the processor(s) 102 that assist in determining what cache events occurred, in determining what to log, and/or in actually writing trace data to a trace buffer (e.g., buffer 102e or a reserved portion of the cache(s) 102b) or file. However, they could also be implemented in a fully virtualized environment (e.g., in which a processor is fully virtualized).
First implementation(s) could record into the trace data store 104d all of the data brought into the cache(s) 102b by logging all cache misses and un-cached reads (i.e., reads from hardware components and un-cacheable memory), along with a time during execution at which each piece of data was brought into the cache(s) 102b (e.g., using a count of instructions executed or some other counter). The effect is to therefore record a log of all the data that was consumed by a traced processing unit 102a during code execution. However, due to alternate execution of plural threads and/or speculative execution, the first implementation(s) could record more data than is strictly necessary to replay execution of the traced code.
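A software model can illustrate what logging every cache miss together with an instruction count might look like. The sketch below is an assumption-laden simplification: it models a tiny least-recently-used cache in which each read counts as one executed instruction, and the LoggingCache class and its fields are hypothetical.

```python
from collections import OrderedDict


class LoggingCache:
    """Toy LRU cache that logs every influx (cache miss) with a timestamp."""

    def __init__(self, memory, capacity, log):
        self.memory = memory
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, least recently used first
        self.log = log
        self.instruction_count = 0

    def read(self, addr):
        self.instruction_count += 1  # assumption: one read is one instruction
        if addr in self.lines:
            self.lines.move_to_end(addr)  # cache hit: nothing is logged
            return self.lines[addr]
        value = self.memory[addr]  # cache miss: data flows into the cache
        self.log.append((self.instruction_count, addr, value))
        self.lines[addr] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        return value


influx_log = []
cache = LoggingCache({0: "a", 1: "b", 2: "c"}, capacity=2, log=influx_log)
values = [cache.read(addr) for addr in (0, 1, 0, 2, 1)]
```

The third read hits the cache and leaves no log entry, while the final read of address 1 is logged a second time because that line had been evicted.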
Second implementation(s) in the second family of approaches improve on the first implementation(s) by tracking and recording only the cache lines that were “consumed” by each processing unit 102a, and/or tracking and recording only subset(s) of cache lines that are being used by processing units 102a that are participating in tracing, rather than recording all the cache misses. As used herein, a processing unit has “consumed” a cache line when it is aware of the cache line's present value. This could be because the processing unit is the one that wrote the present value of the cache line, or because the processing unit performed a read on the cache line. Some embodiments track consumed cache lines with extensions to one or more of the cache(s) 102b (e.g., additional “logging” or “accounting” bits) that enable the processor 102 to identify, for each cache line, one or more processing units 102a that consumed the cache line. Embodiments can track subset(s) of cache lines that are being used by processing units 102a that are participating in tracing through use of way-locking in associative caches. For example, the processor 102 can devote a subset of ways in each address group of an associative cache to tracked processing units, and log only cache misses relating to those ways.
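The notion of a “consumed” cache line can be modeled in a few lines. In the sketch below, a Python set stands in for per-line accounting bits; the TrackedLine class and the read_line and write_line helpers are hypothetical, and real hardware would of course track this state very differently.

```python
class TrackedLine:
    """A cache line plus the set of processing units aware of its value."""

    def __init__(self, value):
        self.value = value
        self.consumers = set()  # stands in for per-line accounting bits


def read_line(line, unit, traced_units, log):
    """Read a line; log the value only on first consumption by a traced unit."""
    if unit in traced_units and unit not in line.consumers:
        log.append((unit, line.value))
    line.consumers.add(unit)
    return line.value


def write_line(line, unit, value):
    """Write a line; the writer becomes the only unit aware of the new value."""
    line.value = value
    line.consumers = {unit}


consumed_log = []
traced = {0, 1}            # processing units participating in tracing
line = TrackedLine(7)
read_line(line, 0, traced, consumed_log)   # first consumption by unit 0: logged
read_line(line, 0, traced, consumed_log)   # already consumed: not logged
read_line(line, 1, traced, consumed_log)   # first consumption by unit 1: logged
write_line(line, 0, 9)                     # unit 0 wrote the present value
read_line(line, 0, traced, consumed_log)   # the writer already knows it: not logged
read_line(line, 1, traced, consumed_log)   # unit 1 must consume the new value
read_line(line, 2, traced, consumed_log)   # unit 2 is not traced: not logged
```

Compared with logging every miss, only three entries are produced for seven operations in this example.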
Third implementation(s) in the second family of approaches could additionally, or alternatively, be built on top of a cache coherence protocol (CCP) used by the cache(s) 102b. In particular, the third implementation(s) could use the CCP to determine a subset of the “consumed” cache lines to record into the trace data store 104d, and which will still enable activity of the cache(s) 102b to be reproduced. This approach could operate at a single cache level (e.g., L1) and log influxes of data to that cache level, along with a log of CCP operations at the granularity of the processing unit that caused a given CCP operation. This includes logging which processing unit(s) previously had read and/or write access to a cache line.
Fourth implementation(s) could also utilize CCP data, but operate at two or more cache levels, logging influxes of data to an “upper-level” shared cache (e.g., at an L2 cache) while using a CCP of at least one “lower-level” cache (e.g., a CCP of one or more L1 caches) to log a subset of CCP state transitions for each cached memory location (i.e., between sections of “load” operations and sections of “store” operations). The effect is to log far less CCP state data than the third implementation(s), since the fourth implementation(s) record based on load/store transitions rather than per-processing unit activity. Such logs could be post-processed and augmented to reach the level of detail recorded in the third implementation(s), but may potentially be built into silicon using less costly hardware modifications than the third implementation(s) (e.g., because less CCP data needs to be tracked and recorded by the processor 102).
Fifth implementation(s) could also operate at two or more cache levels. However, rather than logging influxes of data to an upper-level shared cache, as in the fourth implementation(s), the fifth implementation(s) track influxes to a lower-level cache and then leverage knowledge of one or more upper-level caches to determine if, and how, to log the influx. One variant could include processor logic that detects an influx to the lower-level cache, and then checks one or more of the upper-level caches to see if the upper-level cache(s) have knowledge (e.g., accounting bits, CCP data, etc.) that could prevent the influx from being logged or that could enable the influx to be logged by reference to a prior log entry. Then, this variant can log the influx at the lower-level cache, if necessary, either by value or by reference. Another variant could include processor logic at the lower-level cache that detects the influx, and that sends a logging request to an upper-level cache. The upper-level cache then uses its knowledge (e.g., accounting bits, CCP data, etc.) to determine if and how the influx should be logged (e.g., by value or by reference), and/or to pass the request to another upper-level cache (if appropriate) to repeat the process. The influx is then logged, if appropriate, by an upper-level cache, or by the lower-level cache based on an instruction from an upper-level cache. These fifth implementation(s) could, in some environments, gain many of the benefits of the fourth implementation(s) with fewer hardware modifications.
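The choice between logging an influx by value and logging it by reference can be sketched under simplifying assumptions. Here a dictionary named upper_level_knowledge stands in for whatever an upper-level cache knows about previously logged data; that dictionary, the log_influx function, and the tuple layout of the log entries are all hypothetical.

```python
def log_influx(addr, value, upper_level_knowledge, log):
    """Log an influx to a lower-level cache by value or by reference.

    `upper_level_knowledge` maps an address to a pair of
    (index of the prior log entry, value that was logged there).
    """
    prior = upper_level_knowledge.get(addr)
    if prior is not None and prior[1] == value:
        # The same value was already logged: refer to the earlier entry.
        log.append(("ref", addr, prior[0]))
    else:
        # Unknown or changed value: log it in full and remember where.
        log.append(("value", addr, value))
        upper_level_knowledge[addr] = (len(log) - 1, value)


tiered_log = []
knowledge = {}
log_influx(0x10, 5, knowledge, tiered_log)  # first influx: logged by value
log_influx(0x10, 5, knowledge, tiered_log)  # unchanged value: logged by reference
log_influx(0x10, 6, knowledge, tiered_log)  # value changed: logged by value again
```

A by-reference entry carries only an index into the log rather than the cache line data, which is where the potential size savings would come from in this model.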
Regardless of the recording approach used by the tracer 104a, it can record the trace data into the one or more trace data stores 104d. As examples, a trace data store 104d may include one or more trace files, one or more areas of physical memory, one or more areas of a processor cache (e.g., L2 or L3 cache), or any combination or multiple thereof. A trace data store 104d could include one or more trace data streams. In some embodiments, for example, multiple entities (e.g., processes, threads, etc.), could each be traced to a separate trace file or a trace data stream within a given trace file. Alternatively, data packets corresponding to each entity could be tagged such that they are identified as corresponding to that entity. If multiple related entities are being traced (e.g., plural threads of the same process), the trace data for each entity could be traced independently (enabling them to be replayed independently), though any events that are orderable across the entities (e.g., access to shared memory) can be identified with one or more series of sequencing numbers (e.g., monotonically incrementing or decrementing values) that are global across the independent traces. The trace data store 104d can be configured for flexible management, modification, and/or creation of trace data streams. For example, modification of an existing trace data stream could involve modification of an existing trace file, replacement of sections of trace data within an existing file, and/or creation of a new trace file that includes the modifications.
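As a non-limiting illustration of how such sequencing numbers might be used, the following Python sketch merges independently recorded per-thread event lists into a single global order. The `TraceEvent` type, its fields, and the `merge_orderable_events` function are hypothetical names introduced only for this example, and the sketch assumes monotonically incrementing sequencing numbers with each per-thread list already sorted.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical orderable event recorded in one entity's trace."""
    seq: int          # sequencing number, global across the independent traces
    thread_id: int    # entity for which the event was traced
    description: str


def merge_orderable_events(streams: Iterable[List[TraceEvent]]) -> List[TraceEvent]:
    """Merge per-thread streams (each already sorted by seq) into one global order."""
    return list(heapq.merge(*streams, key=lambda event: event.seq))


# Two independently traced threads that touch shared memory.
thread_1 = [TraceEvent(1, 1, "write shared x"), TraceEvent(4, 1, "read shared y")]
thread_2 = [TraceEvent(2, 2, "read shared x"), TraceEvent(3, 2, "write shared y")]
merged = merge_orderable_events([thread_1, thread_2])
```

Each thread's list remains independently replayable; only the events carrying sequencing numbers need to be interleaved when a cross-thread ordering is required.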
In some implementations, the tracer 104a can continually append to trace data stream(s) such that trace data continually grows during tracing. In other implementations, however, the trace data streams could be implemented as one or more ring buffers. In such implementations, the oldest trace data is removed from the data stream(s) as new trace data is added to the trace data store 104d. As such, when the trace data streams are implemented as ring buffer(s), they contain a rolling trace of the most recent execution of the traced process(es). Use of ring buffers may enable the tracer 104a to engage in “always on” tracing, even in production systems. In some implementations, tracing can be enabled and disabled at practically any time. As such, whether tracing to a ring buffer or appending to a traditional trace data stream, the trace data could include gaps between periods during which tracing is enabled.
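By way of example only, the rolling behavior of a ring buffer could be sketched in Python as follows. The `RingBufferTraceStream` class and the packet contents are assumptions made for illustration and do not describe any particular tracer implementation.

```python
from collections import deque
from typing import List


class RingBufferTraceStream:
    """Hypothetical trace data stream that keeps only the most recent packets."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen discards its oldest item when a new one is appended.
        self._packets = deque(maxlen=capacity)

    def append(self, packet: bytes) -> None:
        self._packets.append(packet)

    def snapshot(self) -> List[bytes]:
        """Return the rolling trace, oldest packet first."""
        return list(self._packets)


stream = RingBufferTraceStream(capacity=3)
for i in range(5):
    stream.append(f"packet-{i}".encode())
# Only the three most recently appended packets remain in the stream.
```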
The trace data store 104d can include information that helps facilitate efficient trace replay and searching over the trace data. For example, trace data can include periodic key frames that enable replay of a trace data stream to be commenced from the point of the key frame. Key frames can include, for example, the values of all processor registers 102d needed to resume replay. Trace data could also include memory snapshots (e.g., the values of one or more memory addresses at a given time), reverse lookup data structures (e.g., identifying information in the trace data based on memory addresses as keys), and the like.
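As one hedged illustration of how key frames could shorten replay, the following Python sketch locates the latest key frame at or before a target execution time point, so that replay need only begin there rather than at the start of the trace. Representing execution time points as integers, and the `nearest_key_frame` function itself, are assumptions made for this example.

```python
from bisect import bisect_right
from typing import List, Optional


def nearest_key_frame(key_frame_points: List[int], target_point: int) -> Optional[int]:
    """Return the latest key frame at or before target_point, or None if none exists.

    key_frame_points must be sorted in ascending order.
    """
    index = bisect_right(key_frame_points, target_point)
    return key_frame_points[index - 1] if index > 0 else None


key_frames = [0, 100, 200, 300]  # hypothetical execution time points of key frames
start = nearest_key_frame(key_frames, 250)  # replay would commence from this key frame
```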
Even when using the efficient tracing mechanisms described above, there may be practical limits to the richness of information that can be stored into the trace data store 104d during tracing by the tracer 104a. This may be due to an effort to reduce memory usage, processor usage, and/or input/output bandwidth usage during tracing (i.e., to reduce the impact of tracing on the application(s) being traced), and/or to reduce the amount of trace data generated (i.e., reducing the disk space usage). As such, even though trace data can include rich information, such as key frames, memory snapshots, and/or reverse lookup data structures, the tracer 104a may limit how frequently this information is recorded to the trace data store 104d, or even omit some of these types of information altogether.
To overcome these limitations, embodiments can include the indexer 104b, which takes the trace data generated by the tracer 104a as input, and which performs transformation(s) to this trace data to improve the performance of consumption of the trace data (or derivatives thereof) by the debugger 104c. For example, the indexer 104b could add key frames, memory snapshots, reverse lookup data structures, etc. The indexer 104b could augment the existing trace data, and/or could generate new trace data containing the new information. The indexer 104b can operate based on a static analysis of the trace data, and/or can perform a runtime analysis (e.g., based on replaying one or more portions of the trace data).
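As a minimal sketch of one transformation an indexer might perform, the following Python example builds a reverse lookup data structure keyed by memory address. The `MemoryAccess` record and the `build_reverse_lookup` function are hypothetical and are shown only to illustrate the idea of indexing trace data by address.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class MemoryAccess(NamedTuple):
    """Hypothetical record of one traced memory access."""
    time_point: int   # execution time point of the access
    address: int      # memory address that was accessed
    is_write: bool
    value: int


def build_reverse_lookup(accesses: List[MemoryAccess]) -> Dict[int, List[MemoryAccess]]:
    """Index accesses by address, with each list ordered by execution time point."""
    index: Dict[int, List[MemoryAccess]] = defaultdict(list)
    for access in sorted(accesses, key=lambda a: a.time_point):
        index[access.address].append(access)
    return dict(index)


trace = [
    MemoryAccess(30, 0x1000, False, 7),
    MemoryAccess(10, 0x1000, True, 7),
    MemoryAccess(20, 0x2000, True, 9),
]
lookup = build_reverse_lookup(trace)
```

With such an index, a debugger could find every access to a given address without scanning the full trace.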
The debugger 104c is usable to consume (e.g., replay) the trace data generated by the tracer 104a into the trace data store 104d, including any derivatives of the trace data that were generated by the indexer 104b (executing at the same, or another, computer system), in order to assist a user in performing debugging actions on the trace data (or derivatives thereof). For example, the debugger 104c could present one or more debugging interfaces (e.g., user interfaces and/or application programming interfaces), replay prior execution of one or more portions of the application 104f, set breakpoints/watchpoints including reverse breakpoints/watchpoints, enable queries/searches over the trace data, etc.
While the tracer 104a, the indexer 104b, and the debugger 104c are depicted (for clarity) as separate entities, it will be appreciated that one or more of these entities could be combined (e.g., as sub-components) into a single entity. For example, a debugging suite could comprise each of the tracer 104a, the indexer 104b, and the debugger 104c. In another example, a tracing suite could include the tracer 104a and the indexer 104b, and a debugging suite could comprise the debugger 104c; alternatively, the tracing suite could include the tracer 104a, and the debugging suite could comprise the indexer 104b and the debugger 104c. Other variations are of course possible.
Notably, the tracer 104a, the indexer 104b, and the debugger 104c need not all exist at the same computer system. For example, a tracing suite could be executed at one or more first computer systems (e.g., a production environment, a testing environment, etc.), while a debugging suite could be executed at one or more second computer systems (e.g., a developer's computer, a distributed computing system that facilitates distributed replay of trace data, etc.). Also, as depicted, the tracer 104a, the indexer 104b, and/or the debugger 104c may each have access to the trace data store 104d, either directly or indirectly, regardless of where the tracer 104a, the indexer 104b, the debugger 104c and/or the trace data store 104d actually reside (i.e., as indicated by the solid arrows).
The trace access component 201 accesses trace data stored in the trace data store 104d. Thus, in connection with debugging a prior execution of an entity that is the subject of debugging (e.g., one or more threads associated with a prior execution of application 104f), the trace access component 201 can access one or more trace data streams that capture a bit-accurate trace of the prior execution of the entity. The trace access component 201 could access original trace data streams recorded by the tracer 104a and/or one or more indexed trace data streams indexed by the indexer 104b. For example, referring to
The replay component 202 replays one or more segments of the prior execution of the entity, based on replaying one or more of the plurality of independently-replayable segments of the trace data stream(s) accessed by the trace access component 201. As such, the replay component 202 can reproduce program execution state at any point in the traced execution of the entity that is the subject of debugging. This state includes, for example, the values of processor registers 102d and the values of memory read from and/or written to (e.g., in the cache(s) 102b). For example,
The breakpoint/watchpoint component 203 enables specification of code and/or data elements in the application 104f which, when encountered or changed by the replay component 202, should cause replay activity by the replay component 202 to be paused/suspended. Breakpoints/watchpoints are typically user-defined (e.g., using a debugger user interface), but they could also be automatically (or semi-automatically) generated (e.g., based on a search or query at the debugger). Thus, for example, the breakpoint/watchpoint component 203 can enable a breakpoint to be set on a particular line in code defining the application 104f; then, when this line of code is encountered by the replay component 202 the breakpoint/watchpoint component 203 can cause replay to pause. In another example, the breakpoint/watchpoint component 203 can enable a watchpoint to be set on a particular variable or data structure; then, when a value of this variable or data structure is changed during replay by the replay component 202 the breakpoint/watchpoint component 203 can cause replay to pause.
The bookmark component 204 enables specification of bookmarks at particular execution time points in the prior execution of the entity. Unlike breakpoints/watchpoints, which are bound to code elements (e.g., lines of code, variables, data structures, etc.) and which will trigger each time the code element is encountered, a bookmark is bound to a particular execution time point. Thus, a bookmark enables a user to quickly jump to different times in the prior execution of application 104f. For example, a user might define one or more of execution time points 303a-303d in timeline 301 as bookmarks and jump to those points on timeline 301 using these bookmarks. Like breakpoints/watchpoints, bookmarks are typically user-defined (e.g., using a debugger user interface), but they could also be automatically (or semi-automatically) generated (e.g., based on a search or query at the debugger).
The historical state component 205 uses trace data to identify historical state of one or more code elements that are part of a prior execution of an entity. This could include, for example, historical state of relevant code elements at times when replay is paused due to the breakpoint/watchpoint component 203 encountering a breakpoint or watchpoint, or due to the bookmark component 204 jumping to a bookmark. Unlike classic debuggers, which are generally limited to using information that is gathered while the debugger executes application code “live,” the historical state component 205 can leverage the trace data (e.g., contained in trace data stream 302) to identify historical information about past and future values of a given code element at various points in a trace timeline (e.g., timeline 301).
In general, the visualization component 206 presents one or more debugging user interfaces for providing a debugging environment, such as presenting program code, facilitating use of watchpoints/breakpoints, presenting runtime state data, etc. However, in the particular context of the embodiments herein, the visualization component 206 can include unique functionality, enabled by the rich data contained in a time travel trace, for indicating the validity/confidence of presently presented data values, for presenting historical information about past and future values of a given code element (i.e., as identified by the historical state component 205), for providing functionality for setting and navigating bookmarks, etc. The visualization component 206 can present this information using a variety of visualization mechanisms.
Examples of a few possible visualization mechanisms are now given in connection with example user interfaces 400a-400h. In particular, these interfaces 400a-400h demonstrate examples of presenting historical state associated with code elements that are part of a prior execution of an entity, of navigating time travel data, and of bookmark functionality. It is understood that user interfaces 400a-400h present only general principles for how various historical state data could be visualized in a user interface, and for how bookmark functionality could be implemented, and embodiments are not limited to the particular layouts and implementations shown in user interfaces 400a-400h.
Initially,
Referring to
Next,
As shown, execution time point 303c is subsequent to execution time point 303b in trace data stream 302 (which corresponds to line 43 as described in connection with
While
In addition to presenting current and last known values of a code element, as demonstrated in
In some embodiments, these memory access indicators could provide visual cues about the memory access. In the depicted example, for instance, a shaded diamond could represent a memory write, while an unshaded diamond could represent a memory read. It will be appreciated that visual cues could include any combination of memory access indicator shape, color, shading, size, animation, etc. These visual cues could present many types of information, such as whether the memory access was a read or a write (as demonstrated in the example), whether a value of a memory location changed as a result of the memory access, a number of times a particular memory location has been accessed (e.g., making the indicator bolder or a deeper color with each access), a frequency of access to a particular memory location (e.g., changing a boldness or a deepness of color as access frequency changes), whether the memory access corresponds to triggering of a breakpoint or a watchpoint, a code structure with which the memory access is associated (e.g., different colors for different functions, classes, etc.), and the like.
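Purely as an illustration of the read/write cue described above, the following Python sketch renders a one-line text swimlane in which a shaded diamond marks a write and an unshaded diamond marks a read. The `render_swimlane` function, the integer time points, and the text-based layout are assumptions of this example; an actual user interface could use any graphical mechanism.

```python
from typing import List, Tuple

WRITE_MARK = "\u25c6"  # shaded diamond, representing a memory write
READ_MARK = "\u25c7"   # unshaded diamond, representing a memory read


def render_swimlane(accesses: List[Tuple[int, bool]], start: int, end: int,
                    width: int = 20) -> str:
    """Render (time_point, is_write) pairs as a one-line text swimlane."""
    lane = ["-"] * width
    span = max(end - start, 1)
    for time_point, is_write in accesses:
        # Scale the time point to a column and clamp it to the lane.
        column = (time_point - start) * width // span
        column = max(0, min(column, width - 1))
        lane[column] = WRITE_MARK if is_write else READ_MARK
    return "".join(lane)


lane = render_swimlane([(0, True), (50, False), (100, False)],
                       start=0, end=100, width=10)
```

In this sketch, two accesses that scale to the same column overwrite one another; a richer view could instead stack or merge such indicators.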
In addition, some embodiments could present value(s) of the subject code element in connection with user interaction with the memory access indicators shown in swimlane 410a. For example,
As shown in
The history view 410 need not be limited to presenting historical state data for a single code element. For example,
As mentioned, the debugger 104c can include bookmarking functionality (i.e., bookmark component 204). In some embodiments, bookmark functionality could be implemented as part of the history view 410 and/or the code pane 402. For example,
Given sufficient richness of trace data, the history view 410 could even represent one or more time periods for which the value(s) of a code element are known to be valid. For example, while the value(s) of a code element are known to be valid at the point of a memory access, some traces could include additional data, such as CCP data and/or cache eviction data, that can be used to determine time period(s) over which these value(s) are known to have remained in a cache line of the cache(s) 102b unchanged. As such, in connection with the memory access indicators, the history view 410 could visually indicate time periods during which it is known for certain that the value is still valid. This could be presented, for example, by way of visual highlights (e.g., shading, colors, etc.) along a swimlane/timeline over which the value associated with a memory access indicator remains valid.
In view of the discussion above of different tracing techniques, it will be appreciated that a trace data stream (e.g., trace data stream 302) could lack trace data for one or more time periods. This could be, for example, because the trace data store 104d stored trace data into a ring buffer, and/or because tracing was disabled for the subject thread during one or more time periods. In some embodiments, the debugger 104c could provide visual indications of these missing periods of trace data. For example, the tools pane 404 could grey out execution time periods that lack trace data in the history view 410 and/or the time scale 416. In another example, the code pane 402 could grey out or deemphasize program code for periods during which its execution is not available in the trace data.
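As a hedged sketch of how a debugger might determine which periods to grey out, the following Python example computes the gaps between recorded periods over a timeline. The `missing_periods` function and the representation of periods as `(start, end)` pairs of integer time points are assumptions made only for illustration.

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start, end) execution time points, hypothetical representation


def missing_periods(recorded: List[Period], timeline_start: int,
                    timeline_end: int) -> List[Period]:
    """Return the periods of the timeline that are not covered by recorded trace data."""
    gaps: List[Period] = []
    cursor = timeline_start
    for start, end in sorted(recorded):
        if start > cursor:
            gaps.append((cursor, start))  # nothing was recorded between cursor and start
        cursor = max(cursor, end)
    if cursor < timeline_end:
        gaps.append((cursor, timeline_end))
    return gaps


# Tracing was enabled for two periods and disabled in between and afterward.
gaps = missing_periods([(0, 100), (250, 400)], timeline_start=0, timeline_end=500)
```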
Some embodiments may leverage memory snapshots to enhance the ability of the debugger to provide last and next known values. For example, in some embodiments the tracer 104a could take a memory snapshot before and/or after one or more function calls. In other embodiments, the indexer 104b could add a memory snapshot before and/or after one or more function calls. This enables the debugger 104c to present last/next known values at the beginning and/or end of a function call, even if the code element(s) that are part of the function were not actually involved in a memory access during a particular instance of the function.
In some embodiments, a subject code element is a “property setter” and/or a “property getter” function. As will be appreciated, such functions may be used to set and/or obtain various data values. Such functions may actually “cover” a variety of memory locations. For example, execution of a property getter may actually involve execution of one or more conditions (which could be based on a variety of data), and/or execution of a property getter may obtain values from a variety of memory locations (e.g., depending on the outcome of a condition). As such, some embodiments are aware of the memory locations that are “covered” by a property setter and/or property getter. In some embodiments, the debugger 104c may treat these “covered” memory locations as “watched” memory locations, and a change in the value of a “watched” memory location may be used to alter the historical state of the function (e.g., its return value). In some embodiments, the debugger 104c may even present state of a “watched” memory location, including whether or not it is a last known value.
At times, data upon which a property getter function relies may not be “currently known” at the same time a “currently known” value being returned by the property getter function is being presented. For example, if the subject code element for which a current value is being presented at execution time point 303b is a property getter function, this property getter could rely on values that are known prior and/or subsequent to execution time point 303b. In these cases, the debugger 104c could present these values (e.g., in state pane 403), along with an indication of whether they are “last known” or “next known” relative to the current execution time point. Additionally, or alternatively, the debugger 104c could visualize the relative timing of the memory accesses corresponding to these values (e.g., in tools pane 404). For example, the debugger 104c could present in the history view 410 a first memory access indicator corresponding to the value returned by the property getter function, as well as one or more other memory access indicators corresponding to value(s) upon which the property getter function relies. The history view 410 can present these other memory access indicator(s) along the timeline of the history view 410 relative to the first memory access indicator.
One or more of the value(s) upon which the property getter function relies may, themselves, be obtained from property getter functions. These “nested” property getter functions may, or may not, have executed during the runtime of the traced entity. In cases in which a nested property getter function did not actually execute during the original runtime of the traced entity, the debugger 104c could “virtually” execute it, in reliance on the traced data, in order to obtain its return value. The debugger 104c could even visually present the relative timing at which the values upon which these nested property getter functions rely were known.
In some embodiments, a subject code element could be a data structure that, itself, contains a plurality of data elements (e.g., an array, a linked list, etc.). Similar to property getter functions, some of these contained data elements may not be “currently known” at the current execution time point in the debugger. As such, embodiments could treat these values in much the same manner as the property getter functions (i.e., individual values could be presented along with “last known” or “next known” indicators and/or the values could be presented along a timeline).
In view of the components and data of computing environment 100 of
As shown, method 500 can include an act 501 of replaying prior execution of an entity based on one or more trace data streams. In some embodiments, act 501 comprises replaying one or more segments of the prior execution of the entity based on one or more trace data streams storing a trace of the prior execution of the entity. For example, the replay component 202 of debugger 104c/200 can replay one or more segments of trace data stream 302, with segments being bounded in some implementations by key frames (e.g., 305a-305e), in order to replay at least a portion of a prior execution of an entity such as application 104f.
Method 500 can also include an act 502 of, based on the replay, presenting historical state data for a code element. In some embodiments, act 502 comprises, based on replaying the one or more segments of the prior execution of the entity, presenting historical state associated with the code element. For example, the historical state component 205 could identify historical state of code element(s) of application 104f, and the visualization component 206 could present this historical state, such as in one or more of the manners shown in connection with
As such, in particular implementations, method 500 could also include one or more of: (i) an act 503 of, at a current execution time point, presenting state of the code element based on a first memory access at the current execution time point; (ii) an act 504 of, at a different execution time point, presenting the state data of the code element with an indication that the state is a last known state or a next known state; or (iii) an act 505 of presenting a history view with indications of state of the code element at the current execution time point, at a prior execution time point, and at a future execution time point. It will be appreciated that method 500 could include each of acts 503-505, or a subset thereof. It will also be appreciated that the presentations of acts 503-505 need not be presented concurrently; indeed, these presentations could be presented serially, one of these presentations could cause another of these presentations to be displayed, one of these presentations could replace another of these presentations, one of these presentations could update another of these presentations, etc. Additionally, method 500 is not limited to the presentations of acts 503-505.
In some embodiments, act 503 comprises, in connection with a first execution time point in the prior execution of the entity, presenting at a user interface a first state of the code element, the first state of the code element being based on a first memory access associated with the code element at the first execution time point. For example, as shown in
In some embodiments, act 504 comprises, in connection with a different execution time point in the prior execution of the entity, presenting at the user interface the first state of the code element along with an indication that the first state of the code element is at least one of: (i) a last known state, wherein the indication is presented based on the different execution time point being after the first execution time point, but prior to a second memory access associated with the code element at a second execution time point in the prior execution of the entity; or (ii) a next known state, wherein the indication is presented based on the different execution time point being prior to the first execution time point, but after a third memory access associated with the code element at a third execution time point in the prior execution of the entity.
As an example of a last known state,
As an example of a next known state, in connection with reversing to an execution time that falls between execution time point 303a and execution time point 303b, the visualization component 206 could present a state pane 403 that shows the values of integer array 406 (i.e., 403a and 403b), along with one or more indicators that the value(s) shown are next known values. While the values displayed may be based on the memory read 304b in trace data stream 302 at execution time point 303b, the indicator(s) that the value(s) shown are next known values could be shown because the current execution time point is prior to execution time point 303b, but after execution time point 303a when the prior memory access 304a occurred.
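The distinction among current, last known, and next known state in acts 503 and 504 can be sketched, under stated assumptions, in Python. In this non-limiting example, memory accesses to a code element are represented as `(time_point, value)` pairs sorted by unique integer time points; the `known_values` function and this representation are hypothetical and are not drawn from any particular embodiment.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

Access = Tuple[int, int]  # (execution time point, value), hypothetical representation


def known_values(accesses: List[Access], current_point: int) -> Dict[str, int]:
    """Classify the value(s) that could be presented at current_point.

    accesses must be sorted by time point. If an access occurred exactly at
    current_point, its value is "current". Otherwise the value of the closest
    earlier access is "last known" and that of the closest later access is
    "next known", where such accesses exist.
    """
    points = [point for point, _ in accesses]
    index = bisect_left(points, current_point)
    if index < len(points) and points[index] == current_point:
        return {"current": accesses[index][1]}
    result: Dict[str, int] = {}
    if index > 0:
        result["last known"] = accesses[index - 1][1]
    if index < len(points):
        result["next known"] = accesses[index][1]
    return result


accesses = [(10, 1), (20, 2), (30, 3)]
between = known_values(accesses, 25)  # falls between the accesses at 20 and 30
```

A user interface could then attach the corresponding indicator to each presented value, or choose to show only one of the last known and next known values.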
In some embodiments, act 505 comprises presenting, at the user interface, a history view that includes (i) a first indication that the first state of the code element exists at the first execution time point in the prior execution of the entity; (ii) a second indication that second state of the code element exists at the second execution time point in the prior execution of the entity, the second state of the code element being based on the second memory access associated with the code element at the second execution time point in the prior execution of the entity; and (iii) a third indication that third state of the code element exists at a third execution time point in the prior execution of the entity, the third state of the code element being based on the third memory access associated with the code element at the third execution time point in the prior execution of the entity. For example, as shown in
In some embodiments, the history view 410 is presented in act 505 based on a user interaction associated with the indication that the first state of the code element is the last known state or the next known state. For example, as discussed in connection with
In some embodiments, presentation of the history view 410 in act 505 can include one or more of (i) presenting the first state of the code element, based on a first interaction with the first indication, (ii) presenting the second state of the code element, based on a second interaction with the second indication or (iii) presenting the third state of the code element, based on a third interaction with the third indication. For example, as discussed in connection with
In some embodiments, presentation of the history view 410 in act 505 includes presenting a timeline view that includes an indication of a current execution time point. For example, as discussed in connection with
In some embodiments, presentation of the history view 410 in act 505 can include one or more of (i) setting a current execution time point to the first execution time point, based on a first interaction with the first indication; (ii) setting the current execution time point to the second execution time point, based on a second interaction with the second indication; or (iii) setting the current execution time point to the third execution time point, based on a third interaction with the third indication. As discussed in connection with
In some embodiments, presentation of the history view 410 in act 505 includes presenting relative execution times for the first execution time point, the second execution time point, and the third execution time point. For example, as mentioned in connection with
In some embodiments, presentation of the history view 410 in act 505 includes presenting a plurality of timelines, a first timeline presenting a history of the code element, and a second timeline presenting a history of another code element. For example, as discussed in connection with
In some embodiments, presentation of the history view 410 in act 505 includes presenting a bookmark which, when selected, sets a current execution time point to an execution time point corresponding to the bookmark. For example, as discussed in connection with
As mentioned, the subject code element could comprise any type of code element, such as data or code itself. For example, in method 500 the code element could be a variable or a data structure. In this situation, the first state of the code element could comprise one or more values stored in memory (e.g., cache(s) 102b) backing the variable or the data structure. In another example, the code element could be a property getter function. In this situation, the first state of the code element could comprise one or more values returned by the property getter function. In embodiments in which the code element is a property getter function, method 500 could include an interaction with the code element to result in the user interface causing a watched variable, other than the code element itself (e.g., a value “covered” by the property getter function), to reflect a last known value.
In addition, as discussed, a property getter function may rely upon value(s) of other code element(s), and one or more of these values may not actually be known at the first execution time point. In these cases, method 500 could further comprise visualizing the value(s) of these code element(s) relative to the first execution time point. For example, the debugger 104c could present the value(s) of these code element(s) on a timeline relative to the value of the subject code element and/or could indicate that the value(s) of these code element(s) are last known or next known at the first execution time point. As discussed, one or more of these other code element(s) could, themselves, be “nested” property getter functions (which may, or may not, have executed during the traced runtime). In these cases, method 500 could further comprise virtually executing a “nested” property getter function, particularly if it did not execute during the traced runtime.
Accordingly, embodiments herein leverage the wealth of information recorded in bit-accurate time travel traces to provide rich debugging experiences, including providing one or more visualizations of historical state associated with code element(s) that are part of a prior execution of an entity. As such, embodiments provide a richness of data not available in prior forms of debugging that can greatly enhance the ability of a debugger to present the operation of program code, which, in turn, can dramatically decrease the amount of time it takes to debug code. Reducing the amount of time taken to debug code directly reduces an amount of computing resources that are consumed in the debugging process.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the described features or acts described above, or the order of the acts described above. Rather, the described features and acts are disclosed as example forms of implementing the claims.
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.