The present invention relates generally to processing circuit architecture and, more particularly, to a pipeline architecture for a processing circuit having multiple modules connected in series to form a module pipeline.
A processing circuit for a mobile terminal or other device may be implemented as an Application Specific Integrated Circuit (ASIC) or Field Programmable Gate Array (FPGA), where different functions are implemented by different modules. Implementation of different functions in different modules enables one module to be updated or replaced without affecting the function of the other modules. Configuration data, status information, and other data used by a module to implement its assigned functions are stored in registers. The use of registers for storing configuration data enables the modules to operate in multiple modes and to perform multiple functions.
The organization of the registers in the processing circuit is one design consideration. One conventional approach to organizing the registers is to centralize all registers in a register unit. Each module interfaces directly to the matched registers. The register unit is responsible for decoding register addresses and outputting stored values to corresponding modules.
This centralized approach has several disadvantages. For example, the centralized solution requires the register unit to decode all register addresses, which generally requires complex logic and thus leads to timing problems. Further, because the register unit is responsible for distributing all registers to the modules, it needs to interface with all of the modules. This one-to-many interface may lead to a routing jam at the register module group. In addition, this solution is hard to update. For example, if a module is added or removed, the register unit and the corresponding logic have to be revised.
Another conventional approach to organizing the registers is to distribute the registers among modules connected to an internal register bus. In this approach, each module includes its own register group and decoder and is connected to an internal register bus. A bus converter provides an external interface to the register bus and converts the external interface protocol into the internal register bus protocol. All of the modules monitor the internal register bus simultaneously. When a register request is asserted, all of the modules decode a target register address associated with the register request. If the target register address specifies a register that belongs to the module, the module latches the register data into or reads register data from the specified register. All other modules do nothing.
While the internal bus structure eliminates the one-to-many interface and the update problems associated with the centralized register solution, the bus structure solution still encounters timing problems. In particular, as the number of modules interfacing with the internal register bus increases, the fan-out of the register bus becomes very high, which results in a large timing delay.
Thus, there remains a need for an improved processing circuit architecture that eliminates or reduces timing problems associated with the conventional approaches.
A processing circuit comprises a plurality of modules serially connected by a plurality of register bus segments to create a module pipeline. Each module comprises one or more registers and is assigned a corresponding address range. A register request, including a target register address, is passed from one module to the succeeding module down the module pipeline until the register request is received at the module containing the targeted register.
Exemplary embodiments of the invention comprise methods implemented by a processing module connected with a plurality of like modules in a module pipeline. In one exemplary method, a register request including a target register address is received over a first interface connected to a preceding module by a first segment of an internal register bus. The target register address is compared to an address range of the processing module. If the target register address falls within the address range of the processing module, a matching register in the processing module is accessed to write data to or read data from the matching register. If the target register address falls outside the processing module's register address range, the register request is output over a second interface to a succeeding module connected to the processing module by a second segment of the internal register bus.
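The compare-then-access-or-forward decision described above can be illustrated with a minimal behavioral sketch in Python. It is not the disclosed hardware: the names `RegisterRequest`, `Module`, `owns`, and `handle` are hypothetical, a contiguous address range is assumed for brevity, and an unwritten register is assumed to read as 0.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class RegisterRequest:
    """Hypothetical request format: target address, direction, write data."""
    address: int
    is_write: bool
    write_data: int = 0


@dataclass
class Module:
    """Hypothetical module owning addresses base .. base + size - 1."""
    base: int
    size: int
    registers: Dict[int, int] = field(default_factory=dict)

    def owns(self, address: int) -> bool:
        # Compare the target register address to this module's address range.
        return self.base <= address < self.base + self.size

    def handle(
        self, req: RegisterRequest
    ) -> Tuple[Optional[int], Optional[RegisterRequest]]:
        """Return (read_data, request_to_forward)."""
        if not self.owns(req.address):
            # Outside the range: output the request toward the succeeding module.
            return None, req
        if req.is_write:
            # Inside the range, write request: latch the data.
            self.registers[req.address] = req.write_data
            return None, None
        # Inside the range, read request (assumed default value of 0).
        return self.registers.get(req.address, 0), None
```

A request is consumed when the second element of the returned pair is `None`; otherwise that element is what would be driven onto the second interface.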
Other embodiments of the invention comprise a processing module in a processing circuit connected to a plurality of like modules forming a module pipeline. One exemplary processing module comprises a first interface, a second interface, one or more registers for storing data, and a decoder. The first interface connects to a preceding module via a first segment of an internal register bus. The second interface connects to a succeeding module via a second segment of the internal register bus. The decoder is configured to receive a register request over the first interface and to compare a target register address associated with the register request to an address range for the processing module. If the target register address falls within the register address range, the decoder accesses a matching register in the processing module to write data into or read data from the matching register. If the target register address falls outside the address range of the processing module, the decoder outputs the register request to the succeeding module over the second interface.
Other embodiments of the invention comprise a method implemented by a processing circuit having a plurality of modules connected to form a module pipeline. In one exemplary method, a register request including a target register address is sequentially passed through the plurality of modules serially connected by an internal register bus. The internal register bus comprises a plurality of segments connecting adjacent modules. At each module receiving the register request, the target register address is compared to an address range of the receiving module. If the target register address falls within the address range of the receiving module, a matching register within the receiving module is accessed to write data into or read data from the matching register. If the target register address falls outside the address range of the receiving module and if there is a succeeding module, the register request is passed to the succeeding module over the internal register bus.
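As a rough illustration of the sequential passing described above, the following Python sketch walks a request through an ordered list of modules and stops at the first module whose address range contains the target address. The function name `pass_down_pipeline`, the representation of a module as an `(address_range, registers)` pair, the use of `write_data=None` to signal a read, and the default read value of 0 are all assumptions made for this sketch.

```python
def pass_down_pipeline(modules, target_address, write_data=None):
    """Pass a request through the modules in order.

    modules: list of (address_range, registers) pairs, first stage first.
    Returns (stage_index, read_data); stage_index is None if no module
    in the pipeline owns the target address.
    """
    for stage_index, (address_range, registers) in enumerate(modules):
        if target_address in address_range:
            if write_data is not None:
                # Write request: store the data in the matching register.
                registers[target_address] = write_data
                return stage_index, None
            # Read request: return the stored value (assumed default 0).
            return stage_index, registers.get(target_address, 0)
        # Otherwise the request moves on to the succeeding module.
    return None, None


# Hypothetical three-stage pipeline with 16 addresses per module.
pipeline = [
    (range(0x00, 0x10), {}),
    (range(0x10, 0x20), {}),
    (range(0x20, 0x30), {}),
]
```

In this sketch a write to address 0x15 is served by the second stage and never reaches the third, mirroring the way the request terminates at the module containing the targeted register.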
Other embodiments of the invention comprise a processing circuit with a pipeline architecture. In one embodiment, the processing circuit comprises a plurality of modules, an internal register bus having two or more segments connecting the plurality of modules in series to form a module pipeline, and a decoder for each module. Each module includes one or more registers. The internal register bus is configured to pass a register request including a target register address through the serially connected modules. The decoder for a receiving module is configured to compare the target register address to a register address range for the receiving module. If the target register address falls within the register address range of the receiving module, the decoder accesses a matching register within the receiving module to write data into or read data from the matching register. If the target register address falls outside the register address range of the receiving module and if there is a succeeding module, the decoder passes the register request to the succeeding module.
The pipeline architecture and techniques herein described provide improved timing performance as compared to the conventional solutions. Further, the processing circuit is more easily extended by modifying existing modules or adding new modules to the pipeline. Because the registers are implemented inside respective modules, modifications to one module will not affect other modules.
Bus converter 10 provides an external interface to the internal register bus 30 to enable external applications to access the registers within the modules 20. The bus converter 10 receives register requests from external applications over an interface bus (not shown) and converts register requests from an external interface protocol used on the interface bus to an internal register bus protocol used on the internal register bus 30. The bus converter 10 forwards the converted register request to the first module 20A in the module pipeline. As will be described in greater detail below, the register request is sequentially passed from one module 20 to the succeeding module 20 until it arrives at the module 20 containing the targeted register. Upon receiving the register request, the receiving module 20 decodes the target register address and compares the decoded address to its assigned register address range to determine whether the target register belongs to the receiving module 20. If the target register address falls within the register address range of the receiving module 20, the module 20 latches the register data into or reads the register data from the matching register, i.e., the register having a register address matching the target register address. If the target register address falls outside the register address range of the receiving module 20, the receiving module 20 passes the register request to the succeeding module 20.
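The protocol conversion performed by the bus converter can be sketched as follows. The external interface protocol is not specified in this description, so the dictionary format with `op`, `addr`, and `data` keys is purely hypothetical, as are the names `InternalRequest` and `convert_request`; the sketch only shows the idea of translating an external request into the fields carried on the internal register bus.

```python
from typing import NamedTuple, Optional


class InternalRequest(NamedTuple):
    """Hypothetical internal register bus request."""
    is_write: bool
    reg_address: int
    write_data: Optional[int]


def convert_request(external: dict) -> InternalRequest:
    """Translate a hypothetical external request into the internal format."""
    op = external["op"]
    if op not in ("read", "write"):
        raise ValueError(f"unsupported operation: {op!r}")
    is_write = op == "write"
    # Write data is only meaningful for write requests.
    write_data = external.get("data") if is_write else None
    return InternalRequest(is_write, external["addr"], write_data)
```

The converted request would then be handed to the first module in the pipeline, which either serves it or forwards it.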
Each register 24 within the register group 22 has a corresponding register address within a predetermined register address range for the host module 20. It will be appreciated that the register address range of a module 20 comprises one or more addresses assigned to the registers 24 within the module 20, and that the register address range may be contiguous or discontiguous. The register group 22 has a first interface 21A connected by one internal bus segment 32 to a preceding module 20 or bus converter 10, and second interface 21B connected by another internal bus segment 32 to a succeeding module 20. The second interface 21B is not used by the last module 20 in the module pipeline, e.g., module 20G. The first and second interfaces are shown in
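Because the register address range may be contiguous or discontiguous, the ownership check can be thought of as a test against one or more address intervals. The Python sketch below assumes a hypothetical representation of the range as a list of inclusive `(low, high)` pairs; the function name `in_address_range` and the example addresses are illustrative only.

```python
def in_address_range(address, ranges):
    """Return True if address lies in any inclusive (low, high) interval."""
    return any(low <= address <= high for low, high in ranges)


# One contiguous block of 16 addresses.
contiguous = [(0x00, 0x0F)]

# Two separate blocks of 4 addresses each.
discontiguous = [(0x00, 0x03), (0x40, 0x43)]
```

With these example ranges, address 0x08 belongs to the contiguous range but not to the discontiguous one, while address 0x41 belongs to the discontiguous range.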
To access a register 24, an external application sends a register request to the bus converter 10. The bus converter 10 converts the register request into the internal register bus protocol and forwards the converted register request to the first module 20A. The register request includes a target register address that specifies a targeted register. The register request may comprise a write request or a read request. When a register request is received by a module 20 over the first interface 21A, the decoder 23 decodes the target register address associated with the register request and compares the target register address with the address range and/or the individual addresses of its registers 24 to determine whether the targeted register belongs to the module 20. If the targeted register does not belong to the module 20, decoder 23 outputs the register request over the second interface 21B to the succeeding module 20 in the pipeline. If the targeted register belongs to the module 20, decoder 23 either latches the write data into the targeted register (write request), or reads the register data from the targeted register (read request).
The internal register bus interface comprises six interfaces. The first four interfaces listed in Table 1 provide the register request described herein to the receiving module 20, e.g., the write/read interface identifies whether the register request is a read request or a write request and the Reg_address interface carries the target register address. The remaining interfaces are used to pass the read data up the pipeline as disclosed herein.
The processing circuit 5, module 20, and corresponding methods 100 and 200 disclosed herein have several benefits over conventional implementations, e.g., better timing performance, flexible update, and reduced power consumption. In particular, because each module 20 only decodes its own register addresses and the internal register bus is segmented, the timing issues of the conventional solutions are avoided. Further, a new module 20 may be added by connecting it into any stage of the register pipeline without requiring any modifications to the logic functions already implemented by the processing circuit. Similarly, an old module 20 may be removed from the processing circuit by disconnecting it from the register pipeline. Also, most of the circuit power is consumed when the internal registers are toggled. Because the pipeline structure disclosed herein terminates the register request when it arrives at the module 20 containing the target register, it reduces the register toggle rate and, in turn, the circuit power consumption.
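The effect of terminating a request at the owning module can be illustrated with a simple count of how many modules a request reaches. This Python sketch is not a power model: it assumes seven modules, one request per module, and that a broadcast bus presents every request to every module, while the pipeline presents it only to the owning module and the modules before it. The function names are hypothetical.

```python
def modules_reached_broadcast(num_modules: int, owner_index: int) -> int:
    """Shared bus: every module sees and decodes every request."""
    return num_modules


def modules_reached_pipeline(num_modules: int, owner_index: int) -> int:
    """Pipeline: the request stops at the owning module (0-based index)."""
    return owner_index + 1


NUM_MODULES = 7  # assumed pipeline length for this illustration

# Send one request to each module in turn and total the modules reached.
broadcast_total = sum(
    modules_reached_broadcast(NUM_MODULES, i) for i in range(NUM_MODULES)
)
pipeline_total = sum(
    modules_reached_pipeline(NUM_MODULES, i) for i in range(NUM_MODULES)
)
```

Under these assumptions the broadcast bus totals 49 module visits against 28 for the pipeline, which gives a sense of why fewer registers toggle per request; actual savings depend on the access pattern and implementation.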
The present invention may, of course, be carried out in other ways than those specifically set forth herein without departing from essential characteristics of the invention. The present embodiments are to be considered in all respects as illustrative and not restrictive, and all changes coming within the meaning and equivalency range of the appended claims are intended to be embraced therein.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2011/000948 | 6/7/2011 | WO | 00 | 2/4/2014 |