A processing system may need a trace system to reproduce executed program instructions. A trace system may generate a trace for a segment of executed program instructions, and store the trace in a trace buffer. In some processing systems, however, the trace buffer may have limited space to store traces. This may result in frequent processing interruptions to empty the trace buffer.
The subject matter regarded as the embodiments is particularly pointed out and distinctly claimed in the concluding portion of the specification. The embodiments, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
Numerous specific details may be set forth herein to provide a thorough understanding of the embodiments. It will be understood by those skilled in the art, however, that the embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the embodiments. It can be appreciated that the specific structural and functional details disclosed herein may be representative and do not necessarily limit the scope of the embodiments.
It is worthy to note that any reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
Referring now in detail to the drawings wherein like parts are designated by like reference numerals throughout, there is illustrated in
System 100 may comprise a plurality of nodes connected by varying types of communications media. The term “communications media” as used herein may refer to any medium capable of carrying information signals. Examples of communications media may include metal leads, semiconductor material, twisted-pair wire, co-axial cable, fiber optic, radio frequency (RF) spectrum, and so forth. The terms “connection” or “interconnection,” and variations thereof, in this context may refer to physical connections and/or logical connections. The nodes may connect to the communications media using one or more input/output (I/O) adapters, such as a network interface card (NIC), for example. An I/O adapter may be configured to operate with any suitable technique for controlling communication signals between computer or network devices using a desired set of communications protocols, services and operating procedures, for example. The I/O adapter may also include the appropriate physical connectors to connect the I/O adapter with a suitable communications medium.
In one embodiment, for example, system 100 may be implemented as a wireless system having a plurality of nodes using RF spectrum to communicate information, such as a cellular or mobile system. In this case, one or more nodes shown in system 100 may further comprise the appropriate devices and interfaces to communicate information signals over the designated RF spectrum. Examples of such devices and interfaces may include omni-directional antennas and wireless RF transceivers. The embodiments are not limited in this context.
The nodes of system 100 may be configured to communicate different types of information. For example, one type of information may comprise “media information.” Media information may refer to any data representing content meant for a user, such as data from a voice conversation, videoconference, streaming video, electronic mail (“email”) message, voice mail message, alphanumeric symbols, graphics, image, video, text and so forth. Another type of information may comprise “control information.” Control information may refer to any data representing commands, instructions or control words meant for an automated system. For example, control information may be used to route media information through a system, or instruct a node to process the media information in a predetermined manner. The embodiments are not limited in this context.
The nodes of system 100 may communicate the media or control information in accordance with one or more protocols. The term “protocol” as used herein may refer to a set of instructions to control how the information is communicated over the communications medium. The protocol may be defined by one or more protocol standards, such as the standards promulgated by the Internet Engineering Task Force (IETF), International Telecommunication Union (ITU), a company such as Intel® Corporation, and so forth.
As shown in
In general operation, wireless nodes 102 and 104 may execute one or more sets of program instructions. The term “program instructions” may include computer code segments comprising words, values and symbols from a predefined computer language that, when placed in combination according to a predefined manner or syntax, cause a processor to perform a certain function. Examples of a computer language may include C, C++, JAVA, assembly and so forth. Consequently, wireless nodes 102 and/or 104 may need a trace system to reproduce program instructions executed by a processor for wireless nodes 102 and/or 104. A trace system for wireless nodes 102 and 104 may be discussed in more detail with reference to
As shown in
In one embodiment, trace system 200 may comprise processor 202. Processor 202 can be any type of processor capable of providing the speed and functionality required by the embodiments. For example, processor 202 could be a processor made by Intel® Corporation or another manufacturer. Processor 202 may also comprise a digital signal processor (DSP) and accompanying architecture, such as a DSP from Texas Instruments Incorporated. Processor 202 may further comprise a dedicated processor such as a network processor, embedded processor, micro-controller, input/output (I/O) processor, controller and so forth. The embodiments are not limited in this context.
In one embodiment, trace system 200 may comprise memory 210. Memory 210 may comprise a machine-readable medium and may include any medium capable of storing instructions adapted to be executed by a processor. Some examples of such media include, but are not limited to, read-only memory (ROM), random-access memory (RAM), programmable ROM, erasable programmable ROM, electrically erasable programmable ROM, dynamic RAM, magnetic disk (e.g., floppy disk and hard drive), optical disk (e.g., CD-ROM) and any other media that may store digital information. In one embodiment, the instructions are stored on the medium in a compressed and/or encrypted format. As used herein, the phrase “adapted to be executed by a processor” is meant to encompass instructions stored in a compressed and/or encrypted format, as well as instructions that have to be compiled or installed by an installer before being executed by the processor. Further, processor 202 may access various combinations of machine-readable storage devices which are capable of storing a combination of computer program instructions and data through various I/O controllers (not shown). The embodiments are not limited in this context.
In one embodiment, trace system 200 may comprise communication bus 208. Communication bus 208 may be any communication bus suitable for communicating information between elements 202, 204 and 206 at the operating speed identified for a given implementation. The embodiments are not limited in this context.
In one embodiment, trace system 200 may comprise TMM 206. TMM 206 may manage trace operations for trace system 200. For example, TMM 206 may operate to generate traces. A “trace” as used herein may refer to a data structure containing information to reproduce executed program instructions. TMM 206 may also store and retrieve traces from trace buffer 204. In addition, TMM 206 may reproduce program instructions executed by processor 202 using a trace. TMM 206 may be discussed in more detail with reference to
In one embodiment, trace system 200 may comprise trace buffer 204. Trace buffer 204 may be used to store traces generated by TMM 206. Trace buffer 204 may be a hardware or software buffer, depending on the architecture of wireless nodes 102 and 104, as well as available system resources. Trace buffer 204 will typically have a finite length. In one embodiment, for example, trace buffer 204 may be a hardware trace buffer storing N trace entries. A typical number of trace entries may comprise N=16, for example, although the embodiments are not limited in this context.
In one embodiment, TMM 300 may comprise trace interrupt module 308. Trace interrupt module 308 may be configured to remove traces from trace buffer 204, and may provide an interrupt or exception event to do so. Trace interrupt module 308 may remove traces from trace buffer 204 under a number of different conditions. In one embodiment, for example, trace interrupt module 308 may receive a request from an external source to retrieve one or more traces from trace buffer 204. The external source may comprise an application program, operating system, or component of TMM 300. In another embodiment, trace interrupt module 308 may be configured to remove one or more traces on a periodic basis, such as every M number of processing cycles, clock cycles, and so forth. In yet another embodiment, trace interrupt module 308 may be configured to monitor trace buffer 204, and remove or “dump” stored traces when trace buffer 204 becomes full or reaches an overflow condition. Trace interrupt module 308 may copy or move traces from trace buffer 204 to some other storage location, such as memory 210, for example. Alternatively, trace interrupt module 308 may be configured to stream out the contents of trace buffer 204 in real-time, which is possible since the traces may be highly compact relative to conventional traces. Where a trace includes a metric, the metric may be a value for any of the performance counters present in system 100 or system 200, such as cycle count, branch count, and so forth. The embodiments are not limited in this context.
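By way of illustration only, the dump-on-full condition described above may be sketched in software as follows. The class name `TraceBuffer`, its fields, and the use of a Python list to stand in for memory 210 are assumptions made for this sketch and are not part of the embodiments; the default capacity of 16 follows the N=16 example.

```python
class TraceBuffer:
    """Hypothetical software model of a fixed-capacity trace buffer."""

    def __init__(self, capacity=16):
        self.capacity = capacity   # N trace entries
        self.entries = []          # traces currently held in the buffer
        self.backing_store = []    # stands in for memory 210 (assumption)
        self.dump_count = 0        # number of dump events taken so far

    def dump(self):
        """Move every stored trace to the backing store and empty the buffer."""
        self.backing_store.extend(self.entries)
        self.entries.clear()
        self.dump_count += 1

    def write(self, trace):
        """Store one trace, dumping first if the buffer is already full."""
        if len(self.entries) >= self.capacity:
            self.dump()
        self.entries.append(trace)


# Writing 40 traces into a 16-entry buffer forces two dumps.
buf = TraceBuffer(capacity=16)
for i in range(40):
    buf.write(("trace", i))
```

In this model every dump stands for one processing interruption, so fewer traces per program translate directly into fewer interruptions.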
In one embodiment, TMM 300 may comprise trace generator 302. Trace generator 302 may be configured to generate traces. More particularly, trace generator 302 may generate a trace in a manner that reduces the amount of information needed to be stored in trace buffer 204 as compared to conventional trace generating techniques.
Conventional trace generating techniques may be unsatisfactory for a number of reasons. For example, conventional trace generating techniques typically record program flow by recording discontinuous addresses that occur as a result of changes in the program flow. These changes may be caused by a number of different types of program instructions, such as program call instructions, jump instructions, conditional branch instructions, unconditional branch instructions, interrupts, and so forth. Conventional techniques may record a trace for every discontinuity in a set or subset of program instructions, which may quickly fill up the trace buffer. This may lead to a relatively frequent number of interrupts needed to remove or dump traces from the trace buffer to some other storage location.
Trace generator 302 may solve these and other problems. Trace generator 302 may generate a trace for a subset of program instructions formed from the overall set of program instructions. An example of a set of program instructions may comprise an application program. An example of a subset of program instructions may comprise a function of an application program. In this manner, the number of traces recorded in trace buffer 204 may be proportional to the number of function calls in the program as opposed to the total number of discontinuities in the program. As a result, the number of traces needed for a given set of program instructions may be significantly fewer than with conventional techniques, in some instances by several orders of magnitude. For example, conventional trace generating techniques may generate as many as approximately 1.5 million traces for the G723 protocol. By way of contrast, trace generator 302 may generate approximately 62,000 traces for the same G723 protocol. These numbers are by way of example only, and may vary for a given implementation.
In one embodiment, trace generator 302 may further comprise a path identification generator (PIDG) 304, a path identification register (PIDR) 306, and a program instruction register (PISG) 308. These elements may be discussed in more detail with reference to
In one embodiment, TMM 300 may comprise trace decoder 310. Trace decoder 310 may be configured to reproduce executed program instructions using a trace. Trace decoder 310 may retrieve a trace from trace buffer 204 or memory 210. Trace decoder 310 may decode the executed program instructions for a function using the trace.
The operations of systems 100-300 may be further described with reference to
In one embodiment, the trace may be generated at block 402 by receiving an endpoint program instruction for the subset of program instructions. The endpoint program instruction may be any terminating instruction, such as a function return or function exit instruction. The path identifier value and end address for the subset of program instructions may be generated. A start address may be retrieved from a program counter register. The trace may be generated using the path identifier value, start address and end address.
In one embodiment, the path identifier value and end address may be generated by initializing a path identifier register. The path identifier register may be configured to store an end address and a path identifier value. Each unconditional branch instruction for the set of program instructions may be assigned an unconditional partial path value and an unconditional offset value. Each conditional branch instruction for the set of program instructions may be assigned a taken branch partial path value, an untaken branch partial path value, and a conditional offset value.
In one embodiment, the path identifier value and end address may be further generated by receiving a branch instruction. A determination may be made as to whether the branch instruction is a conditional branch instruction or unconditional branch instruction. If the branch instruction is an unconditional branch instruction, the path identifier value stored in the path identification register may be incremented with the unconditional partial path value, and the end address stored in the path identification register may be incremented with the unconditional offset value. If the branch instruction is a conditional branch instruction that was taken, the path identifier value may be incremented with the taken branch partial path value, and the end address may be incremented with the conditional offset value. If the branch instruction is a conditional branch instruction that was untaken, the path identifier value may be incremented with the untaken branch partial path value, and the end address may be incremented with the conditional offset value.
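The register update described above may be sketched as follows. This is a minimal model under stated assumptions: the names `PathIdRegister`, `UnconditionalBranch`, `ConditionalBranch` and `apply_branch` are hypothetical, and the numeric values in the usage lines are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class PathIdRegister:
    path_id: int = 0       # running path identifier value
    end_address: int = 0   # running end address


@dataclass(frozen=True)
class UnconditionalBranch:
    partial_path_value: int   # unconditional partial path value
    offset: int               # unconditional offset value


@dataclass(frozen=True)
class ConditionalBranch:
    taken_value: int      # taken branch partial path value
    untaken_value: int    # untaken branch partial path value
    offset: int           # conditional offset value


def apply_branch(reg, branch, taken=False):
    """Update the register for one executed branch instruction."""
    if isinstance(branch, UnconditionalBranch):
        reg.path_id += branch.partial_path_value
    elif taken:
        reg.path_id += branch.taken_value
    else:
        reg.path_id += branch.untaken_value
    # The offset is added whether or not a conditional branch was taken.
    reg.end_address += branch.offset
    return reg


# Illustrative sequence: taken conditional, unconditional, untaken conditional.
reg = PathIdRegister()
apply_branch(reg, ConditionalBranch(taken_value=2, untaken_value=0, offset=4), taken=True)
apply_branch(reg, UnconditionalBranch(partial_path_value=0, offset=8))
apply_branch(reg, ConditionalBranch(taken_value=0, untaken_value=1, offset=4), taken=False)
```

After the three branches the register holds the sum of the partial path values along the executed path, together with the accumulated offsets.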
The operation of systems 100-300, and the programming logic shown in
The control graph shown in
Using the above algorithm, each edge of the control flow graph may be annotated with a partial path value as shown in
By way of example, the path “acdf” has a path identification value of 0. If you traverse path “acdf” through the control flow graph shown in
At least two significant principles are described above. The first is that, given an acyclic control flow graph annotated with partial path values, the number of paths from A to EXIT is Num_Paths(a), and each path from A to EXIT generates a unique value sum in the range 0 ... Num_Paths(a)−1. The second is that, given an acyclic control flow graph, the number of paths from node A to node B is Num_Paths(a,b), and each path from A to B generates a unique value sum in the range 0 ... Num_Paths(a)−1, where Num_Paths(a) is the number of paths from A to EXIT.
The second principle may be confirmed from the first by contradiction. Assume there are two distinct paths pa1 and pa2 from A to B which have the same value sum, denoted by x. Select any path pb from B to EXIT whose value sum is y. The paths pa1+pb and pa2+pb are then distinct paths from A to EXIT with the same value sum x+y. This violates the first principle, and hence the initial assumption that there exist two paths pa1 and pa2 between A and B having the same value sum is not true.
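The two principles may be checked numerically on a small example. The sketch below assigns partial path values in the style of the well-known Ball-Larus path numbering; whether this matches the algorithm referred to earlier is an assumption, and the graph `CFG` is a hypothetical six-node graph chosen for illustration rather than a reproduction of the figures. Function names are likewise hypothetical.

```python
# Hypothetical acyclic control flow graph. The order of successors decides
# which outgoing edge receives partial path value 0.
CFG = {
    "a": ["c", "b"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": ["f", "e"],
    "e": ["f"],
    "f": [],  # exit node
}


def number_paths(cfg):
    """Return (num_paths, edge_values) for an acyclic graph."""
    num_paths = {}
    edge_values = {}

    def visit(node):
        if node in num_paths:
            return num_paths[node]
        total = 0
        for succ in cfg[node]:
            # The edge value is the number of paths already counted at this node.
            edge_values[(node, succ)] = total
            total += visit(succ)
        # A node without successors is the exit and contributes one path.
        num_paths[node] = total if cfg[node] else 1
        return num_paths[node]

    for node in cfg:
        visit(node)
    return num_paths, edge_values


def all_paths(cfg, node, target):
    """Yield every path from node to target as a list of node names."""
    if node == target:
        yield [node]
        return
    for succ in cfg[node]:
        for rest in all_paths(cfg, succ, target):
            yield [node] + rest


def path_sum(path, edge_values):
    """Sum the partial path values along consecutive edges of a path."""
    return sum(edge_values[(u, v)] for u, v in zip(path, path[1:]))


num_paths, edge_values = number_paths(CFG)
exit_sums = sorted(path_sum(p, edge_values) for p in all_paths(CFG, "a", "f"))
inner_sums = sorted(path_sum(p, edge_values) for p in all_paths(CFG, "a", "d"))
```

On this graph, `exit_sums` covers each value from 0 to Num_Paths(a)−1 exactly once, and `inner_sums` contains no repeated value, which is what the first and second principles respectively predict.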
FIGS. 7A-C may illustrate a path identification register, unconditional branch instruction and conditional branch instruction in accordance with one embodiment. Once the partial path values are assigned to the control flow graph, a path identification value for a given path executed by processor 202 may be generated using PIDR 306.
It is worthy to note that the number of bits allocated to specifying the offsets that need to be added to path identification register 702 may determine the maximum path length that can be represented at a time. If this value is set to 1, then trace system 200 may be configured to default into recording every control flow transfer, similar to conventional trace techniques.
Path identification register 702 may be used as follows when a call instruction and return instruction are received. When a call instruction is received, path identification register 702 is treated similarly to the return address register: its value is saved on the call frame stack, and the register is initialized to zero. The call frame stack may be a stack allocated by the compiler based upon a number of factors, such as the number of local variables, temporary locations needed by a function, and so forth. When a return instruction is received, the value of path identification register 702 is written into trace buffer 204, and the previous path identification value, which corresponds to the partial path taken in the calling function, is restored.
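The call and return handling described above may be sketched as follows. The class `PathTracer` and its method names are hypothetical, a Python list stands in for both the call frame stack and trace buffer 204, and the partial path values in the usage lines are illustrative only.

```python
class PathTracer:
    """Hypothetical model of the register's behavior across calls and returns."""

    def __init__(self):
        self.pidr = 0            # models path identification register 702
        self.call_stack = []     # models the call frame stack
        self.trace_buffer = []   # models trace buffer 204

    def on_branch(self, partial_value):
        """Accumulate the partial path value of an executed branch."""
        self.pidr += partial_value

    def on_call(self):
        """Save the caller's partial path and start the callee at zero."""
        self.call_stack.append(self.pidr)
        self.pidr = 0

    def on_return(self):
        """Record the callee's path and restore the caller's partial path."""
        self.trace_buffer.append(self.pidr)
        # With no saved frame (outermost function), restart from zero.
        self.pidr = self.call_stack.pop() if self.call_stack else 0


tracer = PathTracer()
tracer.on_branch(2)   # caller executes part of its path
tracer.on_call()      # callee starts with a zeroed register
tracer.on_branch(1)
tracer.on_branch(4)
tracer.on_return()    # callee's path identification value (5) is recorded
tracer.on_branch(1)   # caller continues from its saved value
```

One trace is written per return, which is consistent with the number of traces being proportional to the number of function calls.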
Once the unique path identification information is generated for an application program or function of an application program, the path identification information may be assigned in the branch instructions. For example, the first program described previously may be modified by the compiler as shown in the second program below:
As shown in the second program, the conditional branch instructions have been assigned two new values, the first being a partial path value for a taken branch, and the second being a partial path value for an untaken branch. The partial path values may be used to generate a path identification value for a path of executed program instructions.
As shown in
As shown in Table 2, trace buffer 204 may have multiple entries 1-N, with each entry having a path identification value, a start address, an end address, and a metric. The metric may comprise, for example, any desired metric to measure performance of system 100 or trace system 200, or to assist in troubleshooting either system. Examples of suitable metrics may comprise total number of processing cycles, branch instructions, taken branches, not taken branches, and so forth. The embodiments are not limited in this context.
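The entry layout described above may be modeled as a simple record. The name `TraceEntry`, the field names, the choice of a cycle count as the metric, and all numeric values below are assumptions made for illustration and are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    path_id: int         # path identification value
    start_address: int   # start address of the traced subset of instructions
    end_address: int     # end address of the traced subset of instructions
    metric: int          # e.g. a cycle count (assumed choice of counter)


# Illustrative entries only; the values are made up for this sketch.
entries = [
    TraceEntry(path_id=0, start_address=0x1000, end_address=0x1040, metric=120),
    TraceEntry(path_id=3, start_address=0x2000, end_address=0x2080, metric=310),
]

# A per-entry metric allows simple aggregate measurements over the buffer.
total_cycles = sum(entry.metric for entry in entries)
```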
Referring again to
Once the traces have been generated, trace decoder 310 may reproduce the complete control flow using the path identification value, start address and end address for each trace.
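One way such decoding might work is sketched below. It assumes the decoder knows the control flow graph and the partial path value on each edge, and that the values were assigned so that every path sum is unique; the graph, the edge values, and the function `decode_path` are hypothetical. The sketch recovers the node sequence from the path identification value alone and does not model the start and end addresses.

```python
# Hypothetical acyclic control flow graph and its partial path values.
CFG = {
    "a": ["c", "b"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": ["f", "e"],
    "e": ["f"],
    "f": [],  # exit node
}
EDGE_VALUES = {
    ("a", "c"): 0, ("a", "b"): 2,
    ("b", "c"): 0, ("b", "d"): 2,
    ("c", "d"): 0,
    ("d", "f"): 0, ("d", "e"): 1,
    ("e", "f"): 0,
}


def decode_path(cfg, edge_values, entry, path_id):
    """Rebuild the node sequence for a valid path identification value."""
    path = [entry]
    node = entry
    remaining = path_id
    while cfg[node]:
        # Follow the outgoing edge with the largest value that still fits.
        succ = max(
            (s for s in cfg[node] if edge_values[(node, s)] <= remaining),
            key=lambda s: edge_values[(node, s)],
        )
        remaining -= edge_values[(node, succ)]
        path.append(succ)
        node = succ
    return path


# Decode every path identification value of the example graph (0 through 5).
decoded = {pid: "".join(decode_path(CFG, EDGE_VALUES, "a", pid)) for pid in range(6)}
```

Because each path sum is unique, each path identification value decodes to exactly one path through the graph.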
In one embodiment, trace system 200 may be modified to handle loops with backward branches. In this case, the conditional branch instruction may be enhanced to initialize the path identification value to zero and make an entry into trace buffer 204, in addition to the ability to add a specified value to path identification register 702. This may be used for back edges of loops. It is worthy to note that although there may be multiple entries, one for each iteration of the loop corresponding to the back edge, the complete path information inside the loop is represented as path identification values, providing good compression. If there is only one path inside the loop, then one entry may be made in trace buffer 204 rather than multiple entries. This may be detected by comparing the previous entry with the entry about to be written.
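The back-edge handling may be sketched as follows. The class `LoopTracer` and its method names are hypothetical, a Python list stands in for trace buffer 204, and the partial path values are illustrative only.

```python
class LoopTracer:
    """Hypothetical model of back-edge handling with duplicate suppression."""

    def __init__(self):
        self.pidr = 0            # models path identification register 702
        self.trace_buffer = []   # models trace buffer 204

    def on_branch(self, partial_value):
        """Accumulate the partial path value of a branch inside the loop body."""
        self.pidr += partial_value

    def on_back_edge(self, partial_value=0):
        """Add the back edge's value, record the iteration's path, then reset."""
        self.pidr += partial_value
        entry = self.pidr
        # Skip the write when the previous entry is identical to this one.
        if not self.trace_buffer or self.trace_buffer[-1] != entry:
            self.trace_buffer.append(entry)
        self.pidr = 0


# Five iterations over the same path inside the loop yield a single entry.
same_path = LoopTracer()
for _ in range(5):
    same_path.on_branch(1)
    same_path.on_back_edge()

# Iterations that alternate between paths yield one entry per change.
mixed_paths = LoopTracer()
for value in (1, 0, 1):
    mixed_paths.on_branch(value)
    mixed_paths.on_back_edge()
```

Note that suppressing repeated entries in this way does not preserve the iteration count; a separate metric would be needed if that count matters.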
In one embodiment, trace system 200 may also be modified to use a check point instruction. The check point instruction may explicitly dump the contents of path identification register 702 into trace buffer 204 and initialize it to zero. This may be useful when the total number of paths in a function is extremely large and path identification register 702 may potentially overflow.
The embodiments may be implemented using an architecture that may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other performance constraints. For example, one embodiment may be implemented using software executed by a processor. The processor may be a general-purpose or dedicated processor, such as a processor made by Intel® Corporation, for example. The software may comprise computer program code segments, programming logic, instructions or data. The software may be stored on a medium accessible by a machine, computer or other processing system. Examples of acceptable mediums may include computer-readable mediums as previously described. In one embodiment, the medium may store programming instructions in a compressed and/or encrypted format, as well as instructions that may have to be compiled or installed by an installer before being executed by the processor. In another example, one embodiment may be implemented as dedicated hardware, such as an Application Specific Integrated Circuit (ASIC), Programmable Logic Device (PLD) or Digital Signal Processor (DSP) and accompanying hardware structures. In yet another example, one embodiment may be implemented by any combination of programmed general-purpose computer components and custom hardware components. The embodiments are not limited in this context.
While certain features of the embodiments have been illustrated as described herein, many modifications, substitutions, changes and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the embodiments.