The present invention relates to the field of diagnostic recording and, more particularly, to a method and apparatus for diagnostic recording using transactional memory.
A facility that collects data about a software application's execution as it runs is often referred to as a “flight recorder.” A software application flight recorder, though, is frequently regarded as cost prohibitive because of the overhead involved. More specifically, a flight recorder in software is typically built as an in-memory circular buffer that stores frequent activity with very little overhead per entry. Even so, tracking every read/write into such a buffer would, despite being very useful when a problem occurs, be too expensive and would likely reduce performance by 50% or more. There is usually no “log” and no “transactions” when using a flight recorder; the mechanism is very primitive. The buffer can be dumped on demand or when a problem occurs. Building a flight recorder into hardware for the sole purpose of diagnostics is unfortunately not sufficient justification for the overhead and hardware enhancements required.
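By way of illustration only, the in-memory circular buffer described above might be sketched in Python as follows. The class name `FlightRecorder`, the capacity, and the event strings are hypothetical and are not taken from any particular implementation.

```python
from collections import deque


class FlightRecorder:
    """Hypothetical in-memory circular buffer of recent activity."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest entry once it is full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def dump(self):
        # Return the surviving events, oldest first.
        return list(self._events)


recorder = FlightRecorder(capacity=3)
for step in ["open", "read", "write", "close"]:
    recorder.record(step)
snapshot = recorder.dump()  # only the three most recent events remain
```

The fixed capacity keeps the memory cost bounded, which is why only the most recent activity is available when the buffer is dumped.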
However, the flight recorder does allow, when there is a problem, the trace file to be played back for root cause analysis. Thus, there is often no need to reproduce the problem. In many cases, the source of the problem can be determined from a single occurrence. Transactional memory has been used for other purposes, but not for diagnostic recording.
Hardware-based transactional memory can provide a very useful flight recorder with very little additional investment. Embodiments herein can provide a mechanism to handle a memory exception (or trap) and treat this as similar to a violation. Transactional memory can provide a means to access a memory transaction log for a thread and dump such information as needed to serve as a flight recorder of recent activity. The transaction log can be stored in a core file or other file, and the steps leading up to the exception event or trap can be replayed post-mortem inside the debugger.
The embodiments of the present invention can be implemented in accordance with numerous aspects consistent with the material presented herein. For example, one aspect of the present invention can include a diagnostic recording method using (software or hardware-based) transactional memory, including the steps of storing a transaction log of the transactional memory (of most recent memory accesses, for example), detecting an exception event, and replaying the last instructions that led up to the exception event using a debugger tool. The step of storing the transaction log can include storing the transaction log in a core file.
Another aspect of the present invention can include a diagnostic recording device, such as a diagnostic flight recorder, having transactional memory (holding a log of most recent memory accesses, for example) and a processor coupled to the transactional memory. The processor can be operable to store contents of a transaction log of the transactional memory, detect an exception event, and replay the last instructions that led up to the exception event using a debugger tool. As noted above, the transactional memory can be hardware-based transactional memory or software-based transactional memory. The processor can also be operable to store the contents of the transaction log in a file such as a core file. The core file can include a stack, a register dump, a memory dump, and the transaction log. The processor can be further operable to make a special system call into an operating system to provide a calling thread access to its own transaction log. The processor can be further operable to cause the debugger tool to load the core file, an executable file, and a library to enable the diagnostic recording device to retrace transactions occurring at the diagnostic recording device up to the exception event. The exception event is also known as a trap or trap condition, fault, program check, or violation.
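As a non-limiting sketch, the contents of such a core file (a stack, a register dump, a memory dump, and the transaction log) might be modeled as below. The `CoreFile` class, its field types, the JSON encoding, and all sample values are assumptions made for illustration; real core file formats differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CoreFile:
    """Hypothetical layout; not an actual core file format."""

    stack: list = field(default_factory=list)
    registers: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)
    transaction_log: list = field(default_factory=list)

    def to_json(self):
        # Serialize all four sections into one document.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        # Rebuild the structure, as a debugger tool might when loading it.
        return cls(**json.loads(text))


# Sample values are invented; addresses are strings because JSON keys are.
core = CoreFile(
    stack=["main", "process_request", "copy_buffer"],
    registers={"pc": 4198400, "sp": 140737488346112},
    memory={"4096": 0},
    transaction_log=[{"addr": "4096", "old": 17, "new": 0}],
)
restored = CoreFile.from_json(core.to_json())
```

Keeping the transaction log alongside the memory dump lets a tool see both the final value at an address and the value it replaced.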
It should be noted that various aspects of the invention can be implemented as a program or a computer-implemented method for controlling computing equipment to implement the functions described herein, or a program for enabling computing equipment to perform processes corresponding to the steps disclosed herein. This program may be provided by storing the program in a magnetic disk, an optical disk, a semiconductor memory, or any other recording medium, or can also be provided as a digitally encoded signal conveyed via a carrier wave. The described program can be a single program or can be implemented as multiple subprograms, each of which interacts within a single computing device or interacts in a distributed fashion across a network space.
There are shown in the drawings, embodiments which are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
Embodiments herein can include (but are not limited to) a hardware-based transactional memory technology that keeps a log of recent memory accesses in a hardware buffer. When a hardware exception (trap, fault, violation, etc.) occurs, this log can be dumped to a file (or as part of a file such as a core file) and used with a debugger for post-mortem analysis. The last instructions that led up to the exception would be recorded and replayed with a debugger using the transaction log as a reference. Note that replaying can be done by a tool or by a human using a debugger or a debugger interface.
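The dump-on-exception behavior described above can be approximated in software for illustration. The following Python sketch is a simulation only: `LoggedMemory`, `SimulatedTrap`, `run_with_recorder`, and the JSON dump format are hypothetical stand-ins for the hardware buffer, the hardware exception, and the core file.

```python
import json
import os
import tempfile
from collections import deque


class SimulatedTrap(Exception):
    """Stands in for a hardware exception (trap, fault, violation)."""


class LoggedMemory:
    """Hypothetical memory that logs recent accesses in a bounded buffer."""

    def __init__(self, log_capacity=64):
        self.cells = {}
        self.log = deque(maxlen=log_capacity)

    def write(self, address, value):
        # Record the old value so overwritten data is not lost.
        self.log.append({"op": "write", "addr": address,
                         "old": self.cells.get(address), "new": value})
        self.cells[address] = value

    def read(self, address):
        if address not in self.cells:
            raise SimulatedTrap(f"invalid read at address {address}")
        value = self.cells[address]
        self.log.append({"op": "read", "addr": address, "value": value})
        return value


def run_with_recorder(program, memory, dump_path):
    """Run program; on a simulated trap, dump the log and return True."""
    try:
        program(memory)
        return False
    except SimulatedTrap as trap:
        with open(dump_path, "w") as handle:
            json.dump({"trap": str(trap), "log": list(memory.log)}, handle)
        return True


def faulty_program(memory):
    memory.write(16, 7)
    memory.read(16)
    memory.read(32)  # never written: raises the simulated trap


dump_path = os.path.join(tempfile.mkdtemp(), "core_log.json")
trapped = run_with_recorder(faulty_program, LoggedMemory(), dump_path)
with open(dump_path) as handle:
    dumped = json.load(handle)  # the accesses that led up to the trap
```

In this sketch the dumped file holds the two successful accesses in order, followed by the description of the access that trapped.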
The idea is an extension of proposed hardware-based transactional memory models. The log that is required for hardware-based transactional memory can be dumped when a failure occurs and used for debugging purposes (e.g., via a debugger) to replay the last instructions that led up to the exception. The cause, although often obfuscated by complexity and overwritten data, can be captured between the transaction log and the virtual memory image for the process or operating system.
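One way to picture how the transaction log and the memory image together recover overwritten data is to undo logged writes, newest first, starting from the dumped image. The sketch below assumes a hypothetical log format in which each entry records the address, the old value, and the new value, with `None` meaning the address held nothing before the write; the function name `rewind` and the sample data are likewise illustrative.

```python
def rewind(memory_image, transaction_log, steps):
    """Return the memory state before the last `steps` logged writes."""
    image = dict(memory_image)  # leave the dumped image untouched
    start = max(len(transaction_log) - steps, 0)
    for entry in reversed(transaction_log[start:]):
        if entry["old"] is None:
            # Assumed convention: the address did not exist before.
            image.pop(entry["addr"], None)
        else:
            image[entry["addr"]] = entry["old"]
    return image


# Memory image at the time of the exception: the pointer is already zero.
final_image = {"ptr": 0, "count": 3}
log = [
    {"addr": "count", "old": None, "new": 2},
    {"addr": "ptr", "old": None, "new": 4096},
    {"addr": "count", "old": 2, "new": 3},
    {"addr": "ptr", "old": 4096, "new": 0},
]

one_step_back = rewind(final_image, log, 1)
two_steps_back = rewind(final_image, log, 2)
```

Stepping back one entry shows that `ptr` held 4096 immediately before it was overwritten with zero, information that the final memory image alone no longer contains.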
Referring to
Referring to the thread 200 of
As illustrated in
Referring to
The present invention may be realized in hardware, software, or a combination of hardware and software. The present invention may be realized in a centralized fashion in one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software may be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
The present invention also may be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
This invention may be embodied in other forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.