I. Field
The present disclosure relates generally to the field of integrated circuits and, more specifically, to techniques for debugging a programmable integrated circuit.
II. Background
The increasing complexity of programmable integrated circuits used in devices performing computationally intensive data processing (for example, devices for mobile or wired communications, graphics processors, microprocessors, and the like) creates a need for sophisticated embedded (i.e., on-chip, or in-silicon) test systems adapted for in-situ debugging of such integrated circuits.
Conventional on-chip test systems utilize circuit-specific test architectures (such as scan-chain test architectures) that can consume significant portions of a chip's real estate. Such systems often lack flexibility in accommodating design modifications.
Techniques for debugging a programmable integrated circuit are described herein. In an embodiment, an off-chip computer executing a test program initiates pre-determined instruction-cache-misses on an integrated circuit running an application program. During an instruction-cache-miss, the off-chip computer substitutes instructions of the application program with test instructions contained in the test program. Responses of the integrated circuit to the test instructions are analyzed, and results of the analysis are used to debug the integrated circuit.
In exemplary designs, the inventive techniques are used for debugging processors and graphics processors of wireless or wired communication system-on-chip devices, among other programmable integrated circuits.
Various aspects and embodiments of the invention are described in further detail below.
The Summary is neither intended nor should it be construed as being representative of the full extent and scope of the present invention. These and additional aspects will become more readily apparent from the detailed description, particularly when taken together with the appended drawings.
The images in the drawings are simplified for illustrative purposes and are not depicted to scale. To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures, except that suffixes may be added, when appropriate, to differentiate such elements.
The appended drawings illustrate exemplary embodiments of the invention and, as such, should not be considered as limiting the scope of the invention that may admit to other equally effective embodiments. It is contemplated that features or steps of one embodiment may be beneficially incorporated in other embodiments without further recitation.
The term “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any embodiment or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or designs.
Referring to the figures,
The system 100A illustratively comprises a system-on-chip (SOC) device 110 including the integrated circuit 120A, optional on-chip and off-chip devices 134 and 150 (for example, integrated circuit devices), and optional on-chip and off-chip electronic memories 136 and 140 that are interconnected using a system bus 111 of the SOC device, and a test computer 160.
In exemplary applications, the integrated circuit 120A may be a portion of the SOC device 110 used in an apparatus for wireless or wired communications, processing video data, or rendering graphics, among other apparatuses performing computationally intensive data processing, such as processors, graphics processors, and the like. In some embodiments, such an integrated circuit is a Q-shader graphics processing unit (GPU) of a video data processing system, the salient features of which are discussed below in reference to
Generally, the integrated circuit 120A comprises a processing core 122, a program controller 124, a memory unit 126 including a program memory 128, an instruction cache 132, and a gate module 130 adapted to generate “instruction cache miss” events or instruction-cache-misses in the integrated circuit.
Typically, data/command exchanges between the processing core 122, program controller 124, and memory unit 126 are performed via an internal system bus 131 of the integrated circuit 120A. However, other interfacing schemes (not shown) have been contemplated for the integrated circuit 120A and are within the scope of the present invention (for example, the program controller 124 or the memory unit 126 may directly be coupled to the processing core 122).
In operation, portions of program instructions of a respective application program are downloaded, in a pre-determined order, from the program memory 128 to the instruction cache 132. From the instruction cache 132, the program instructions are sequentially forwarded, via the gate module 130, to the program controller 124 that administers and monitors execution of the instructions by the processing core 122. In an alternate embodiment (shown in phantom in
The test computer 160 is connected to the integrated circuit 120A using interfaces 161 and 163 coupled to the gate module 130 and the system bus 131, respectively. In one embodiment, the interface 161 is used for transmitting requests for generating the “instruction cache miss” events in the integrated circuit 120A, and the interface 163 is used to monitor data processing in the integrated circuit 120A and perform debugging of the integrated circuit.
In the depicted embodiment, the program instructions of an application program run by the integrated circuit 120A are transmitted, via interface 135A, from the instruction cache 132 to the gate module 130. From the gate module 130, via interface 137A, these instructions are forwarded to the program controller 124. Upon a request initiated, via the interface 161, by the test computer 160, the gate module 130 may interrupt the flow of the program instructions from the instruction cache 132 to the program controller 124, thus generating an “instruction cache miss” event in the running application program.
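By way of illustration only, the gating behavior described above may be modeled in software as follows. This is a minimal sketch, not the disclosed hardware: the class name GateModule, its methods, the function run, and the instruction mnemonics are all hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GateModule:
    """Toy software model (hypothetical) of a gate placed between the
    instruction cache and the program controller."""
    miss_requested: bool = False

    def request_miss(self) -> None:
        # Models a request arriving from the test computer.
        self.miss_requested = True

    def clear_miss(self) -> None:
        # Models restoring the normal instruction flow.
        self.miss_requested = False

    def forward(self, instruction: str) -> Optional[str]:
        # Gate open: pass the instruction through.
        # Gate closed: return None, which stands for a forced cache miss.
        return None if self.miss_requested else instruction


def run(cache: List[str], gate: GateModule, miss_at: int) -> List[str]:
    """Forward cached instructions until the gate reports a miss and
    return the instructions that reached the program controller."""
    delivered: List[str] = []
    for step, instr in enumerate(cache):
        if step == miss_at:
            gate.request_miss()
        out = gate.forward(instr)
        if out is None:
            break
        delivered.append(out)
    return delivered


delivered = run(["LOAD", "ADD", "STORE", "JUMP"], GateModule(), miss_at=2)
print(delivered)  # ['LOAD', 'ADD']
```

In this sketch the request is raised at a fixed step index; in practice the trigger condition would be chosen by the test program.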
During the “instruction cache miss” event, substitute instructions (for example, test instructions) may be provided, via a branch 165 of the interface 163, from the test computer 160 to the program controller 124. Alternatively, the substitute instructions may be provided to the program controller 124 via a branch 167 coupling the interfaces 163 and 131. Accordingly, execution of the respective application program or test instructions by the integrated circuit 120A may be monitored by the test computer 160 via the branch 167 or, alternatively, using an off-chip link 169 (shown with broken line), which couples the test computer to the system bus 111 of the SOC 110.
Referring to
Referring to
Together, the gate module 130 and interfaces 161, 163 form a test channel for debugging the integrated circuits 120A, 120B and 120C (hereinafter “integrated circuit 120”) in the respective embodiments. Such a test channel occupies a small area of the chip and is broadly insensitive to particular architecture and/or design characteristics of the integrated circuit 120 or the SOC 110. The gate module 130 and the respective interfaces may be fabricated simultaneously with other elements of the integrated circuit 120 or the SOC 110. The test computer 160 may be coupled to the test channel using conventional electrical couplers, such as contact pads, contact pins, connectors, and the like.
At step 210, an application program is loaded into the memory unit 126 and/or activated in the integrated circuit 120. Program instructions of the running application program are fetched, in a pre-determined order, from the program memory 128 into the instruction cache 132. From the instruction cache 132, via the gate module 130, the instructions are sequentially forwarded to the program controller 124.
At step 220, a pre-determined test program adapted for debugging the integrated circuit 120 and, optionally, the application program, is activated on the test computer 160. In one embodiment, the test program contains instructions (i.e., test instructions) that allow the test computer 160 to monitor program flow in the integrated circuit 120 and selectively initiate requests for “instruction cache miss” events at pre-determined steps of the running application program. In particular, during monitoring of execution of the application program, the test program may allow the test computer 160 to monitor contents of internal registers of the integrated circuit 120 or memory cells of the memory unit 126.
In one embodiment, the test program and the test instructions are stored in a memory of the test computer 160. In alternate embodiments, these instructions or at least a portion of the test program may be stored in the memory unit 126, the memories 136 or 140, or memories (not shown) of the devices 134 or 150.
At step 230, at pre-determined steps in the test program or the application program, the test computer 160 initiates requests for the “instruction cache miss” events in the application program running in the integrated circuit 120. In one embodiment, such requests may be initiated based on analysis of information collected via monitoring the program flow or data processing in the integrated circuit 120.
The requests are forwarded, via the interface 161, to the gate module 130. In response, the gate module 130 generates the “instruction cache miss” events in the integrated circuit 120. Specifically, in response to each request, transmission of the instructions of the application program from the instruction cache 132 is terminated, and a program break point is set at a pre-determined step of the application program.
At step 240, during the “instruction cache miss” event, one or more instructions are sequentially stuffed, via the interface 163, from the test computer 160 into the program controller 124 for execution by the processing core 122. As such, during step 240, the application program's instructions remaining in the instruction cache 132 or the program memory 128 are substituted with the test instructions contained in the test program running on the test computer 160.
In particular, these test instructions may allow the test computer 160 to selectively monitor, modify, or replace, at run time of the application program, contents of internal registers of the integrated circuit 120 or memory cells of the memory unit 126. In a further embodiment, the test instructions may allow simulation of pre-determined critical conditions or events in hardware or software elements of the integrated circuit 120.
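By way of illustration only, instruction stuffing during a forced miss may be sketched in software as follows. The three-field instruction format, the opcodes SET and ADD, the register name r0, and the function names are hypothetical assumptions made for this example; the sketch also assumes that the application resumes from the interrupted step.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (opcode, register, operand).
Instruction = Tuple[str, str, int]


def execute(regs: Dict[str, int], instr: Instruction) -> None:
    """Execute one instruction of the toy instruction set on the registers."""
    op, reg, val = instr
    if op == "SET":
        regs[reg] = val
    elif op == "ADD":
        regs[reg] = regs.get(reg, 0) + val
    else:
        raise ValueError(f"unknown opcode: {op}")


def run_with_stuffing(app: List[Instruction],
                      break_step: int,
                      test_instrs: List[Instruction]) -> Dict[str, int]:
    """Run the application; at break_step, execute the stuffed test
    instructions first, then resume with the interrupted step."""
    regs: Dict[str, int] = {}
    for step, instr in enumerate(app):
        if step == break_step:
            # Forced miss: test instructions are executed in place of
            # instructions from the cache.
            for t in test_instrs:
                execute(regs, t)
        # Normal flow (restored after the miss): run the application step.
        execute(regs, instr)
    return regs


app = [("SET", "r0", 1), ("ADD", "r0", 2), ("ADD", "r0", 3)]
regs = run_with_stuffing(app, break_step=2, test_instrs=[("SET", "r0", 100)])
print(regs)  # {'r0': 103}
```

Here the stuffed instruction overwrites register r0 at run time, so the final application step adds 3 to 100 rather than to 3.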
At step 250, the test computer 160 monitors, via the interface 163, responses of the integrated circuit 120 to the test instructions fetched into the program controller 124 during the respective “instruction cache miss” event. For example, the test computer 160 may monitor contents of the internal registers or the memory cells of the integrated circuit 120 and compare the collected information with pre-calculated data contained in the test program.
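By way of illustration only, the comparison of collected register contents with pre-calculated data may be sketched as follows. The function name compare_registers, the register names, and the sample values are hypothetical and do not represent data from any actual device.

```python
from typing import Dict, List, Optional, Tuple

# One mismatch record: (register name, expected value, observed value or None).
Mismatch = Tuple[str, int, Optional[int]]


def compare_registers(observed: Dict[str, int],
                      expected: Dict[str, int]) -> List[Mismatch]:
    """Return every register whose observed value differs from the
    pre-calculated value; a register that was not observed reports None."""
    mismatches: List[Mismatch] = []
    for name, want in sorted(expected.items()):
        got = observed.get(name)
        if got != want:
            mismatches.append((name, want, got))
    return mismatches


observed = {"r0": 103, "r1": 7}
expected = {"r0": 103, "r1": 8, "r2": 0}
print(compare_registers(observed, expected))
# [('r1', 8, 7), ('r2', 0, None)]
```

An empty list indicates that the observed state matches the pre-calculated data for every register listed in the expected set.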
Upon execution of the test instructions stuffed into the program controller 124 during a particular “instruction cache miss” event, transmission of the application program instructions from the instruction cache 132 to the program controller 124 is restored. In one embodiment, after the “instruction cache miss” event, the application program may be executed starting from the program step substituted by the respective “instruction cache miss” event. In another embodiment, the application program may be executed starting from the program step following the program step substituted by the “instruction cache miss” event or, alternatively, from a program step specified in the test instructions.
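By way of illustration only, the three resumption options described above may be expressed as a selection function. The enumeration Resume, the function resume_step, and the use of integer step indices are hypothetical conventions adopted for this sketch.

```python
from enum import Enum
from typing import Optional


class Resume(Enum):
    SAME_STEP = "same"        # re-execute the substituted step
    NEXT_STEP = "next"        # continue with the step after it
    SPECIFIED = "specified"   # jump to a step named by the test instructions


def resume_step(interrupted: int,
                policy: Resume,
                target: Optional[int] = None) -> int:
    """Return the application step at which execution continues
    after the forced miss ends."""
    if policy is Resume.SAME_STEP:
        return interrupted
    if policy is Resume.NEXT_STEP:
        return interrupted + 1
    if target is None:
        raise ValueError("SPECIFIED policy requires a target step")
    return target


print(resume_step(5, Resume.SAME_STEP))             # 5
print(resume_step(5, Resume.NEXT_STEP))             # 6
print(resume_step(5, Resume.SPECIFIED, target=12))  # 12
```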
At step 260, the test computer 160 analyzes responses of the integrated circuit 120 to the test instructions provided during the “instruction cache miss” events to determine errors, if any, in execution of data processing operations by components of the integrated circuit 120. Then, based on these results, the integrated circuit 120 may be debugged using the test computer 160 or, alternatively, another remote processor. In one embodiment, results of a debugging process may be used to correct the identified error(s) in situ. Such debugging may be performed in real time (for example, during the “instruction cache miss” events) or, alternatively, upon completion of the application program. In a further embodiment, the results of such analysis may also be used for debugging the application program.
In exemplary embodiments, the method 200 may be implemented in hardware, software, firmware, or any combination thereof in the form of a computer program product comprising one or more computer-executable instructions. When implemented in software, the computer program product may be stored on or transmitted using a computer-readable medium, which includes computer storage media and computer communication media.
The term “computer storage medium” refers herein to any medium adapted for storing the instructions that cause the computer to execute the method. By way of example, and not limitation, the computer storage medium may comprise solid-state memory devices, including electronic memory devices (e.g., RAM, ROM, EEPROM, and the like), optical memory devices (e.g., compact discs (CD), digital versatile discs (DVD), and the like), or magnetic memory devices (e.g., hard drives, flash drives, tape drives, and the like), or other memory devices adapted to store the computer program product, or a combination of such memory devices.
The term “computer communication medium” refers herein to any physical interface adapted to transmit the computer program product from one place to another using, for example, a modulated carrier wave, an optical signal, a DC or AC current, or similar means. By way of example, and not limitation, the computer communication medium may comprise twisted wire pairs, printed or flat cables, coaxial cables, fiber-optic cables, digital subscriber lines (DSL), or other wired, wireless, or optical serial or parallel interfaces, or a combination thereof.
The Q-shader GPU 302 may be compliant, for example, with a document “OpenVG Specification, Version 1.0,” Jul. 28, 2005, which is publicly available. This document is a standard for 2D vector graphics suitable for handheld and mobile devices, such as cellular phones and the other wireless communication apparatuses referred to above. Additionally, the Q-shader GPU 302 may also be compliant with the OpenGL 2.0, OpenGL ES 2.0, or D3D9.0 graphics standards.
In operation, each graphics application 315 (for example, a video game or a video conferencing application, among other video applications) generates high-level commands that are communicated, via the API 320, to the driver/compiler 330. The driver/compiler 330 converts these high-level commands into individual application sub-programs, which are executed by the Q-shader GPU 302. In the Q-shader GPU 302, execution of the application sub-programs may be performed sequentially or, alternatively, concurrently.
Referring back to
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.