The present invention relates to a processor, which comprises an address generator for generating an address based on a base address and a fractional step.
A processor can be used for processing data. The data may e.g. be stored in a memory, which is accessible by the processor. To access the data from the memory, the processor needs an address, at which the data is stored. The addresses that are needed for acquiring data for a certain application may be generated and temporarily stored in an address register. The address register may be continuously updated with new addresses during the progress of the application.
In many applications, the step used for updating the address in the address register is not the same each time an updating operation is performed. The step may be either an increment or a decrement. Different applications may use different strategies for the steps. Furthermore, a step may even be fractional, i.e. a non-integer value, even though addresses of the memory are always located on integer positions. For example, if a step of 0.75 is needed, the following address sequence may be generated when the value of a starting address is 0: [0].75, [1].5, [2].25, [3].0, [3].75, [4].5, ..., wherein the integer part of each address, indicated within brackets, will be used for the memory access. Here, it can be seen that the address having value 3 will be used twice. Some applications where fractional updating is used are interpolation, scaling, resampling, synchronization, and table look-up.
Updating an address for the address register by a fractional amount is normally done by a software instruction, or a sequence of software instructions, run by the processor. Also, it is known in the art that hardware units, which are external to a processor, perform interpolation based on fractional steps. The interpolation is hence performed outside the processor itself. The addresses generated as part of an interpolation method are not accessible by the software instructions. Because the hardware units perform only a predetermined interpolation method, the resulting interpolated data is accessible to other units, but the address used for the interpolation is not. Hence, the address can only be used for a predetermined purpose, such as the predetermined interpolation method. This makes the hardware units inflexible.
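The address sequence in the example above can be reproduced with a short sketch. The function name `fractional_sequence` is hypothetical and used only for illustration; the integer part is taken here with `math.floor`, which is an assumption that matches the example for non-negative values.

```python
import math

def fractional_sequence(start, step, count):
    """Return (exact_value, integer_address) pairs, one per update."""
    value = start
    pairs = []
    for _ in range(count):
        value += step                      # fractional update
        pairs.append((value, math.floor(value)))
    return pairs

seq = fractional_sequence(0.0, 0.75, 6)
addresses = [addr for _, addr in seq]
print(addresses)  # [0, 1, 2, 3, 3, 4]
```

The integer address 3 appears twice, as noted in the text.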
In US 2004/0003199 A1, a memory interface device is disclosed providing a fractional address interface between a data processor and a memory system. The memory interface provides for interpolation of data values. The memory interface includes an address generator for generating first and second memory addresses, and a memory access unit for retrieving first and second data values and for interpolating between the first and second data values.
An issue with updating an address for the address register by software run by the processor is that updating of the address is often done in kernel parts, such as inner loops, of the software. Normally, the inner loops are frequently being executed. Hence, any extra executed instructions to update the address register in a kernel part have a direct negative impact on the system performance. This issue is particularly severe for updating with a fractional step of an address for the address register, as all processor architectures available require executing a few extra software instructions to update the address by a fractional step.
In applications that use a fractional step for updating an address, the bottleneck of a core part of the processor is often getting data in and out from a data repository. One issue with getting data in and out is efficient updating of an address. As an example, a synth wave oscillator, which is basically an interpolator, needs fractional steps for updating an address for all input samples as well as all coefficients from a table. Consequently, even a small performance improvement in the updating with a fractional step of addresses may give substantial improvement of the performance of the system, wherein the address generation is utilized.
According to an embodiment of the invention, a processor for processing data comprises an address generator. The address generator is operative to generate an address based on a base address and an offset value, and to update said offset value by a fractional step. The processor may be a main processor, which comprises the address generator in a core part thereof. Alternatively, the processor may comprise a main processor and at least one co-processor operatively connected to the main processor. The co-processor comprises the address generator.
The address generator is operative to generate the address based on the base address and an offset value. The offset value is updated by the fractional step.
The address generator is operative to generate the address based on a base address, which is a fractional base address.
The address generator may comprise a quantizer operative to generate the address based on the sum of the base address plus the offset value.
The address generator may be operative to generate the address based on a base address, which is an integer base address.
The address generator may comprise a quantizer operative to generate an integer offset value based on the offset value, which is based on the fractional step. The address generator may be operative to generate the address based on the integer offset value.
The address generator may comprise an adder operative to generate the sum of the base address plus the offset value.
The address generator may comprise an adder operative to generate and output the sum of an offset value, which is one of the offset value, which is based on the fractional step, and an input offset value, plus the fractional step. Additionally, the address generator may comprise a modulo counter operative to generate a subsequent offset value, to be used for generating a subsequent address, based on the output from the adder and a maximum offset value.
The address generator may comprise a multiplexer operative to output the offset value. The offset value may be one of the input offset value and an offset value generated by the modulo counter.
The address generator may be operative in response to at least one software instruction.
According to another embodiment, an electronic apparatus comprises the processor according to any of claims 1 to 12.
The electronic apparatus may be a mobile telephone. Further embodiments of the invention are defined in the dependent claims.
It should be emphasized that the term “comprises/comprising” when used in this specification is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.
Embodiments of the invention provide an efficient device for generating an address, which is accessible from any type of process, application, or operation. Thus, the address is not only accessible from a predetermined method.
Further objects, features and advantages of embodiments of the invention will appear from the following detailed description, reference being made to the accompanying drawings, in which:
Embodiments of the invention will be described with reference to the accompanying drawings. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. The terminology used in the detailed description of the particular embodiments illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, like numbers refer to like elements.
The address generator 2 is operative to generate an address 8 (
A step may be an increment. Alternatively, the step may be a decrement. A fractional step as used herein is a step, which is an integer or a non-integer amount or value. The fractional step may be used to update from a first address to a subsequent second address in a sequence of addresses. Thus, updating may be either incrementing or decrementing. A non-integer value may e.g. be a fixed-point number or a floating point number. Both integer values and non-integer values may be represented in binary form.
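As an illustration of a fixed-point representation of a fractional step, the following minimal sketch stores values as integers with a fixed number of fractional bits. The choice of 16 fractional bits and the helper names `to_fixed` and `integer_part` are assumptions made for this example only; the text does not prescribe a particular format.

```python
FRAC_BITS = 16  # assumed number of fractional bits (not given in the text)

def to_fixed(x):
    """Convert a real number to a fixed-point integer."""
    return int(round(x * (1 << FRAC_BITS)))

def integer_part(fixed):
    """Drop the fractional bits; an arithmetic shift rounds toward minus infinity."""
    return fixed >> FRAC_BITS

step = to_fixed(0.75)
acc = to_fixed(0.0)
parts = []
for _ in range(4):
    acc += step            # increment by the fractional step
    parts.append(integer_part(acc))
print(parts)  # [0, 1, 2, 3]
```

A decrement is obtained in the same way with a negative step, e.g. `to_fixed(-0.75)`.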
Processor 1a and main processor 5 may e.g. be a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or a GPU (Graphics Processing Unit). A co-processor may be a special-purpose processor, which assists the main processor 5 in performing certain operations. The co-processor 6 extends the instruction set of the main processor 5. Hence, e.g. the efficiency of the whole system defined by the main processor 5 and the co-processor 6 is improved. In the embodiment of
The embodiment of
The embodiment of
A first input terminal of the first adder 101 is operatively connected to the first input terminal 107 of the address generator 100. A second input terminal of the first adder 101 is operatively connected to an output terminal of the multiplexer 106. An output terminal of the first adder 101 is operatively connected to an input terminal of the quantizer 102.
An output terminal of the quantizer 102 is operatively connected to an output terminal of the address generator 100.
A first input terminal of the second adder 103 is operatively connected to the output terminal of the multiplexer 106. A second input terminal of the second adder 103 is operatively connected to the fifth input terminal 111 of the address generator 100. An output terminal of the second adder 103 is operatively connected to a first input terminal of the modulo counter 104.
A second input terminal of the modulo counter 104 is operatively connected to the fourth input terminal 110 of the address generator 100. An output terminal of the modulo counter 104 is operatively connected to an input terminal of the register 105.
An output terminal of the register 105 is operatively connected to a first input terminal of the multiplexer 106.
A second input terminal of the multiplexer 106 is operatively connected to the second input terminal 108 of the address generator 100. A third input terminal of the multiplexer 106 is operatively connected to the third input terminal 109 of the address generator 100.
The address generator 100 is operative to generate the address 8 based on a base address and a fractional step. In the figures, the fractional step is denoted Δ.
In the embodiment of
The base address may point at a first address in a sequence of addresses that should be generated. The base address may be different when different sequences of addresses are generated. The base address may be set adaptively.
In the embodiment of
The first adder 101 is operative to generate the sum of the base address plus the offset value. In the embodiment of
Address:=Q(base address+offset value),
where base address and offset value may be non-integer values having a positive or negative sign, Q is the operation extracting the integer part of the operand, and address is the resulting address 8 having a non-fractional value.
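The relation above can be sketched as follows. The names `quantize` and `generate_address` are hypothetical. The text only states that Q extracts the integer part; rounding toward minus infinity (`math.floor`) is an assumption here, and truncation toward zero would give a different result for negative sums.

```python
import math

def quantize(x):
    """Q: integer part of the operand (assumed to round toward minus infinity)."""
    return math.floor(x)

def generate_address(base_address, offset_value):
    """Address := Q(base address + offset value)."""
    return quantize(base_address + offset_value)

print(generate_address(100.0, 2.25))  # 102
print(generate_address(100.5, 2.75))  # 103
```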
In the embodiment of
For generating a first address of a sequence of addresses, the input offset value may be used. Then, for any subsequent offset value for the sequence of addresses to be generated, offset values from the register 105 may be used. The register 105 may be set before a sequence of addresses is to be generated. An initial value of the register 105 may e.g. be 0.
The second adder 103 may be a step adder. The second adder 103 is operative to add the fractional step to the offset value provided by the multiplexer 106. Thus, the offset value is updated a fractional amount. The output of the second adder 103 is forwarded to the modulo counter 104. The second adder 103 is operative to generate and output the sum of the offset value plus the fractional step. The offset value may be one of the offset value, which is based on the fractional step, and the input offset value.
The modulo counter 104 is operative to generate a subsequent offset value, to be used for generating a subsequent address. The modulo counter 104 is operative to count modulo N based on the updated offset value provided by the second adder 103. The modulo counter 104 may e.g. perform the following updating operation:
offset value_{n+1} := mod(offset value_n + Δ_n, N_n),
where mod represents the modulo operation, Δ the fractional step, and n specifies each access cycle. The updating operation may, but does not have to, be performed as part of reading the generated address.
The modulo counter 104 is operative to ensure that the offset value is non-negative and remains less than a maximum offset value N. Thus, it is ensured that an address having a value, which exceeds a maximum value, is not generated. Consequently, only addresses within a certain address range determined by the base address and the maximum offset value N will be generated by the address generator 100. For example, assume N=4. Then, if the offset value, on which the modulo counter 104 operates, is 3.75, the modulo counter 104 will output the offset value 3.75. However, if the offset value, on which the modulo counter 104 operates, is 4.75, the modulo counter 104 will output the offset value 0.75.
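The N=4 example above can be checked with a minimal sketch of the updating operation. The function name `modulo_update` is hypothetical. Python's `%` operator returns a non-negative result for a positive modulus, which matches the stated behavior for negative steps as well.

```python
def modulo_update(offset_value, step, max_offset):
    """offset_{n+1} := mod(offset_n + step, N)."""
    return (offset_value + step) % max_offset

print(modulo_update(3.0, 0.75, 4))    # 3.75 (below N, passed through)
print(modulo_update(4.0, 0.75, 4))    # 0.75 (4.75 wraps around)
print(modulo_update(0.25, -0.75, 4))  # 3.5  (negative sum wraps to the top)
```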
The register 105 is operative to temporarily store one or several offset values until it/they is/are needed by the multiplexer 106.
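The interplay of the first adder, quantizer, second adder, modulo counter, register, and multiplexer described above may be summarized in a small behavioral model. This is only a sketch under stated assumptions: the class and method names are hypothetical, the control signal is modeled as an optional `input_offset` argument that selects the input offset value instead of the register value, and the quantizer is assumed to round toward minus infinity.

```python
import math

class AddressGenerator:
    """Behavioral sketch of a fractional-step address generator."""

    def __init__(self, base_address, max_offset, register_init=0.0):
        self.base_address = base_address
        self.max_offset = max_offset
        self.register = register_init      # holds the next offset value

    def next_address(self, step, input_offset=None):
        # Multiplexer: input offset if supplied, otherwise the register value.
        offset = self.register if input_offset is None else input_offset
        # First adder followed by the quantizer.
        address = math.floor(self.base_address + offset)
        # Second adder followed by the modulo counter; result goes to the register.
        self.register = (offset + step) % self.max_offset
        return address

gen = AddressGenerator(base_address=16, max_offset=4)
addrs = [gen.next_address(0.75, input_offset=0.0)]
addrs += [gen.next_address(0.75) for _ in range(6)]
print(addrs)  # [16, 16, 17, 18, 19, 19, 16]
```

With a maximum offset value of 4, the generated addresses stay within 16 to 19 and wrap around to 16 again.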
The embodiment of the address generator 200 illustrated in
The basic difference between the embodiments of
Quantizer 202 is operative to provide an integer offset value, which is based on the fractional step. In the embodiment of
The first adder 201 of the embodiment of
Providing the base address and the offset value as integer values allows e.g. for a less complex implementation of the first adder 201. Thus, a processor 1a, 1b comprising the address generator 200 of
The fractional step may be a positive or negative fractional step. Furthermore, the base address value may be any address of a range of addresses when negative fractional steps are allowed.
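For the variant with an integer base address, where the offset value is quantized before the addition, a negative fractional step might be sketched as follows. The function names `integer_base_address` and `walk` are hypothetical, and the quantizer is again assumed to round toward minus infinity.

```python
import math

def integer_base_address(base_address, offset_value):
    """Quantize the offset first, then add it to the integer base address."""
    integer_offset = math.floor(offset_value)
    return base_address + integer_offset

def walk(base_address, step, max_offset, count, offset=0.0):
    """Generate `count` addresses, updating the offset modulo max_offset."""
    out = []
    for _ in range(count):
        out.append(integer_base_address(base_address, offset))
        offset = (offset + step) % max_offset
    return out

print(walk(32, -0.5, 4, 5))  # [32, 35, 35, 34, 34]
```

Because the modulo operation keeps the offset non-negative, a decrementing sequence wraps to the top of the address range rather than falling below the base address.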
In the embodiment of
In some embodiments, N may be an integer or fractional value. Thus, the modulo counter 104 may be operative on N being an integer and/or fractional N. In other embodiments, the modulo counter 104 is only operative on N being an integer value. The latter modulo counter is less complex to implement, allowing e.g. for a less complex processor 1a, 1b, compared to using N having a fractional value.
The address generator 100, 200 is operative in response to at least one software instruction. The software instruction to which the address generator 100, 200 is responsive may e.g. comprise the input value(s) at any of the first input terminal 107, the second input terminal 108, the third input terminal 109, the fourth input terminal 110, and the fifth input terminal 111. Alternatively or additionally, the software instruction may provide the input values at the first input terminal 107, the second input terminal 108, the third input terminal 109, the fourth input terminal 110, and the fifth input terminal 111 either explicitly or indirectly, e.g. from a register. Still alternatively or additionally, the software instruction may be an instruction to request the address 8. Consequently, the processor 1a, 1b has the advantage that the address 8 may be generated in response to a software instruction of any process, application, or operation, whereby high flexibility is achieved.
The input values at the first input terminal 107, the second input terminal 108, the fourth input terminal 110, and the fifth input terminal 111 of the address generator 100, 200, may be provided by the processor 1a, 1b. The input values may be generated when the processor 1a, 1b runs software instructions for e.g. a certain process, application or operation. In some embodiments, all input values are generated before an instruction loop for a process, an application, or an operation is entered. Then, the addresses are generated in response to the control signal applied on the third input terminal 109. The control signal may be generated in response to executing an instruction, e.g. during an inner instruction loop. In other embodiments, all input values except the fractional step are generated before one or several inner instruction loops for a process, an application, or an operation is/are entered. Then, the fractional step is adaptively generated within the inner instruction loop(s). An address may then be generated in response to providing the fractional step. Consequently, in embodiments of the invention, a single instruction comprising the fractional step or the control signal is sufficient for providing a new address. The software instruction to which the address generator 100, 200 is responsive may comprise reading a generated address 8 as well as providing a new fractional step. This allows for efficient generation of the address 8. For inner instruction loops comprising few instructions, this allows for a substantial performance boost, as the percentage of the instructions relating to the generation of the address 8 in the inner instruction loop is substantially reduced compared to generating the address using a prior-art processor executing software instructions only, where the address is generated entirely in the inner loop. Consequently, the performance will be substantially improved with embodiments of the invention.
Also, the address generation can be performed in parallel with other operations performed by the core part 3 of the processor 1a (see
The generated address 8 provided by embodiments of the invention may, for example, be used by the processor 1a, 1b to access a memory and/or a register. The memory/register access may for example provide for reading from and/or writing to the memory/register.
Embodiments of the invention provide for generating addresses that are not subsequent addresses. This is e.g. possible by adjusting the fractional step and/or the value of the base address accordingly, e.g. by having a fractional step>1. Moreover, this allows e.g. for improved interpolation possibilities, as it provides for interpolation between data values that are not neighboring. Neighboring data values are data values that are stored on neighboring addresses of a memory. Addresses are neighboring if they are subsequent addresses. Other interpolation methods may also be used, such as using data values from subsequent addresses. If so, the same fractional step may be used for generating multiple addresses. Thus, the processor 1a, 1b according to embodiments of the invention may be used for several different interpolation methods without any modifications of the hardware, which e.g. provides flexibility.
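As one possible use of the generated addresses, the sketch below resamples a table by linear interpolation between data values at subsequent addresses. Linear interpolation, the wrap-around at the end of the table, and the function name `resample_linear` are assumptions chosen for illustration; the text does not prescribe a particular interpolation method.

```python
import math

def resample_linear(table, step, count):
    """Read `count` interpolated samples from `table` using a fractional step."""
    out = []
    position = 0.0
    for _ in range(count):
        address = math.floor(position)       # integer address for the memory access
        frac = position - address            # fractional part used as the weight
        neighbor = (address + 1) % len(table)
        out.append((1 - frac) * table[address] + frac * table[neighbor])
        position = (position + step) % len(table)
    return out

table = [0.0, 4.0, 8.0, 12.0]
print(resample_linear(table, 0.75, 4))  # [0.0, 3.0, 6.0, 9.0]
```

Interpolating between values that are not neighboring would only require a different choice of the second address, with the same address generation.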
Embodiments of the invention also allow for generating addresses based on a fractional offset and a fractional step. In some embodiments, also the value of the base address is a fractional value. As the input values to the address generator 2, 100, 200 may be provided in a plurality of different ways, the address generator 2, 100, 200, and thus the processor 1a, 1b, is flexible. A fractional base address may be used to obtain addresses that are rounded to the nearest integer address, e.g. by increasing the base address by 0.5.
The input values to the address generator 2, 100, 200 may e.g. be provided by a process, an application, or an operation. The address generator 2, 100, 200 may be independent of the purpose, for which the generated address 8 should be used. Thus, embodiments of the invention provide for using a single address generator 2, 100, 200 for generating addresses based on fractional steps, which addresses are used for multiple purposes. Thus, the addresses may be requested from multiple and/or different processes, applications, or operations. Therefore, embodiments of the invention provide a cheap and space saving design, as multiple address generators are not necessary.
The processor 1a, 1b and the address generator 2, 100, 200 may be provided in hardware comprising hardwired components.
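The effect of increasing the base address by 0.5 can be illustrated with a short sketch. The function name `address` is hypothetical, and the quantizer is assumed to take the integer part by rounding toward minus infinity.

```python
import math

def address(base_address, offset_value):
    """Integer part of base address plus offset value."""
    return math.floor(base_address + offset_value)

offsets = [0.75, 1.5, 2.25]
truncated = [address(0.0, o) for o in offsets]
rounded = [address(0.5, o) for o in offsets]
print(truncated)  # [0, 1, 2]
print(rounded)    # [1, 2, 2]
```

With the base address raised by 0.5, each offset maps to the nearest integer address instead of the next lower one.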
The present invention has been described above with reference to specific embodiments. However, other embodiments than the above described are possible within the scope of the invention. The different features of the invention may be combined in other combinations than those described. The scope of the invention is only limited by the appended patent claims.
Number | Date | Country | Kind |
---|---|---|---|
06111687.7 | Mar 2006 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP07/52820 | 3/23/2007 | WO | 00 | 9/18/2008 |
Number | Date | Country | |
---|---|---|---|
60745794 | Apr 2006 | US |