The present invention relates to optimization of hardware used in data processing.
Data processing requires the optimization of the available resources, as well as the power consumption of the circuits involved in data processing. This is the case in particular when reconfigurable processors are used.
Reconfigurable architecture includes modules (VPU) having a configurable function and/or interconnection, in particular integrated modules having a plurality of unidimensionally or multidimensionally positioned arithmetic and/or logic and/or analog and/or storage and/or internally/externally interconnecting modules, which are connected to one another either directly or via a bus system.
These generic modules include in particular systolic arrays, neural networks, multiprocessor systems, processors having a plurality of arithmetic units and/or logic cells and/or communication/peripheral cells (IO), interconnecting and networking modules such as crossbar switches, as well as known modules of the type FPGA, DPGA, Chameleon, XPUTER, etc. Reference is also made in particular in this context to the following patents and patent applications of the same applicant:
P 44 16 881.0-53, DE 197 81 412.3, DE 197 81 483.2, DE 196 54 846.2-53, DE 196 54 593.5-53, DE 197 04 044.6-53, DE 198 80 129.7, DE 198 61 088.2-53, DE 199 80 312.9, PCT/DE 00/01869, DE 100 36 627.9-33, DE 100 28 397.7, DE 101 10 530.4, DE 101 11 014.6, PCT/EP 00/10516, EP 01 102 674.7, PCT/DE 97/02949 (PACT02/PCT), PCT/DE 97/02998 (PACT04/PCT), PCT/DE 97/02999 (PACT05/PCT), PCT/DE 98/00334 (PACT08/PCT), PCT/DE 99/00504 (PACT10b/PCT), PCT/DE 99/00505 (PACT10c/PCT), DE 101 39 170.6 (PACT11), DE 101 42 903.7 (PACT11a), DE 101 44 732.9 (PACT11b), DE 101 45 792.8, (PACT11c), DE 101 54 260.7 (PACT11d), DE 102 07 225.6 (PACT11e), PCT/DE 00/01869 (PACT13/PCT), DE 101 42 904.5 (PACT21), DE 101 44 733.7 (PACT21a), DE 101 54 259.3 (PACT21b), DE 102 07 226.4 (PACT21c), PCT/DE 00/01869 (PACT13/PCT), DE 101 10 530.4 (PACT18), DE 101 11 014.6 (PACT18a), DE 101 46 132.1 (PACT18II), DE 102 02 044.2 (PACT19), DE 102 02 175.9 (PACT19a), DE 101 35 210.7 (PACT25), DE 101 35 211.5 (PACT25a), DE 101 42 231.8 (PACT25aII), (PACT25b). The entire contents of these documents are hereby included for the purpose of disclosure.
The above-mentioned architecture is used as an example to illustrate the present invention and is referred to hereinafter as VPU. The architecture includes an arbitrary number of arithmetic, logic (including memory) and/or memory cells and/or networking cells and/or communication/peripheral (IO) cells (PAEs: Processing Array Elements) which may be positioned to form a unidimensional or multidimensional matrix (PA); the matrix may have different cells of any desired configuration. Bus systems are also understood here as cells. A configuration unit (CT) which affects the interconnection and function of the PA through configuration is assigned to the entire matrix or parts thereof. The configuration of a VPU is determined by writing configuration words into configuration registers. Each configuration word determines a subfunction. PAEs may require a plurality of configuration words for their configuration, e.g., one or more words for the interconnection of the PAE, one or more words for the clock determination and one or more words for the selection of an ALU function, etc.
Generally, a processor which is operated at a higher clock frequency requires more power. Thus, the cooling requirements in modern processors increase substantially as the clock frequency increases. Moreover, additional power must be supplied which is critical in mobile applications in particular.
Determining the clock frequency of a microprocessor based on its state is known, for example from the area of mobile computers. However, problems arise with the overall speed at which certain applications are carried out.
An object of the present invention is to provide a novel method for commercial application.
In an example embodiment of the present invention, the power consumption may be reduced and/or optimized in VPU technology. As far as different methods are addressed in the following, it should be pointed out that they provide advantages, either individually or in combination.
In a data processing unit (VPU) according to a first aspect of the present invention, by using a field of clocked logic cells (PAEs) which is operable in different configuration states and a clock preselecting means for preselecting logic cell clocking, the clock preselecting means is designed in such a way that, depending on the state, a first clock is preselected at least at a first cell (PAE) and an additional clock is preselected at least at an additional cell (PAE).
It is therefore suggested to operate different cells using different clocking. As a rule, the additional clock corresponds to the first clock; the former is thus situated in a defined phase angle to the latter. In order to achieve optimum data processing results, in particular with regard to the required data processing time, as well as the power consumption of the entire data processing unit, it is suggested that clocking takes place depending on the state, which means that no clock is preselected jointly for all cells based on a certain state, but rather an appropriate clock is assigned to each cell based on the state.
Furthermore, it is suggested that the clocking be designed to be totally configurable, so that one calibration (configuration) mutually influences the clocking of the total number of cells.
It is possible and desired that the clock preselecting means is designed in such a way that it receives the setpoint clock for at least one first cell from a unit which preselects configuration states. This makes it possible to select the clocking of the cell based on its configuration as soon as this configuration is determined. This has the advantage that configuration may take place free of problems.
The unit preselecting configuration states may be a compiling unit, which means that required or desired clocking of the cell is already determined during the compiling of the program. If the compiling unit preselects the configuration states, then the cell configuration preselecting unit may convey clocking for cell configuration to a cell to be configured. This is advantageous since it is possible to merely add clock-determining information to the configuration word or the configuration instruction with which the configuration of a cell is determined, without additional measures being required such as the implementation of clock-assigning buses which separately transmit the clock-determining signals, or the like; it should be noted that this is possible in principle.
It may also be provided that the clock preselecting means is designed in such a way that it receives the setpoint clock or a clock-influencing signal from one of the other logic cells, in particular a configurable logic cell. This is particularly advantageous if a first logic cell awaits an input signal from an external unit and the cells which process subsequently arriving signals are not to be activated until such signals arrive. This makes it possible to implement a logic field sleeping mode in which only one or a plurality of cells are activated, if necessary, on a very low level, i.e., very slow clocking, and the remaining field is clocked extremely slowly. The clock frequencies required in the remaining field are dependent on physically necessary clocking which is required for the preservation of memory contents or the like.
It is also advantageous to receive a clock-influencing signal from another logic cell if, using one logic cell, one or a series of a plurality of different arithmetic and/or logical operations may be executed which, at least in part, require a different number of clock cycles, but this may not be determined in advance by the compiling unit. Also in such a case, the subsequent cells do not need to be operated at a high clock frequency if they are appropriately clocked down by corresponding signals which indicate the state of the cell participating in a processing sequence.
In a preferred variant, the clock preselecting means includes a central clock preselecting unit, e.g., a central clock generator, whose clock is transmitted to the individual cells via a clock line, as well as a local clock-generating unit for generating a local clock from and/or in response to the central clock transmitted via the clock line. In a possible embodiment, clocking of the central clock preselecting unit may be set or influenced by a configuration. The local clock-generating unit is preferably implemented by using a frequency divider and/or a frequency multiplier, and the frequency divider ratio is preferably determined by the preselections of the clock preselecting means according to the clock determination based on the state.
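The relationship between the central clock and a local clock derived by a frequency divider and/or multiplier may be sketched as follows. This is a minimal illustration only; the function name, the cell names, and the multiplier/divider pairs are assumptions made for the example and are not taken from the embodiments.

```python
from fractions import Fraction


def local_clock_hz(central_hz: int, multiplier: int = 1, divider: int = 1) -> Fraction:
    """Derive a cell's local clock from the central clock.

    The multiplier and divider stand in for the ratio that the clock
    preselecting means would set based on the state (assumed model).
    """
    if multiplier < 1 or divider < 1:
        raise ValueError("multiplier and divider must be positive integers")
    return Fraction(central_hz * multiplier, divider)


# Hypothetical per-cell (multiplier, divider) settings, e.g. carried
# along with the configuration word of each cell.
cell_ratios = {"pae0": (1, 1), "pae1": (1, 2), "pae2": (1, 4)}
local_clocks = {
    name: local_clock_hz(100_000_000, m, d) for name, (m, d) in cell_ratios.items()
}
```

Exact fractions are used so that a ratio such as 1/3 does not introduce rounding in the illustration; hardware would of course realize the ratio with counters or a PLL.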
In a preferred variant, the logic cells or at least some of the logic cells include at least one ALU and/or are formed by such. It is possible and preferred if some of the logic cells contain at least one memory unit and/or register unit which may be assigned to the remaining logic cells. In particular, this unit may be provided for data to be processed and/or for configurations of the cell.
It is possible that a plurality of logic cells are identical and are operated using different clocking corresponding to their particular configuration. It is possible in particular that all logic cells are identical.
A method is also provided for operating a field of clocked logic cells which may be set into different configuration states, a first state being determined, at least temporarily, for at least one first cell, a clock which is to be assigned to the first cell being determined dependent on the first state and the cell being operated using this clock; a second state is determined for at least one additional cell, a second clock which is to be assigned to the second cell being determined dependent on the second state and the second cell being operated using the second clock which differs from the first clock.
As mentioned above, clocking may be preselected together with the configuration. The state is then the configuration state and/or is at least determined by it.
In known and configurable logic cells, cells are typically combined in groups for executing complex operations. If individual cells execute suboperations which run in fewer clock cycles than is the case with those cells which are engaged in particularly drawn-out suboperations of the complex total operation executed by the group, it is preferred if these cells are operated at different clock rates, namely in such a way that the cells for less complex operations, thus operations which run in fewer clock cycles, are clocked slower than the other cells; it is preferred in particular if the cells of one group are clocked collectively in such a way that the number of blank cycles within the group is minimized. An alternative and/or an addition to this lies in temporarily changing the use of cells burdened with less complex tasks for a certain number of clock cycles, thus changing the use during a fixed number of clock cycles.
In particular, the case may occur that the maximum clock cycle rate of PAEs and/or PAE groups is limited by their function and in particular by their interconnection. The propagation time of signals via bus systems plays an increasingly frequency-limiting role, in particular in advancing semiconductor technology. The method therefore allows slower clocking of such PAEs and/or PAE groups, while other PAEs and/or PAE groups operate at a different and, if needed, higher frequency. It is suggested in a simplified embodiment to make the clock rate of the entire reconfigurable module (VPU) dependent on the maximum clock rate of the slowest PAE and/or PAE group. In other words, the central clock preselecting unit may be configured in such a way that the highest operating clock common to all PAEs and/or PAE groups (in other words the smallest common denominator of all maximum clock rates) is globally generated for all PAEs.
The above-described method is particularly advantageous if the cells of the group process data sequentially, i.e., the result determined by one cell is passed on to one or multiple cells which are subsequently processing data.
It should be noted that in addition to prioritizing tasks within the cell field for clock preselection, the condition of a power source may also be included in cell clocking determination. Clocking may be reduced overall in the case of a drop in supply voltage, in particular in mobile applications. Clocking-down for preventing an overtemperature by responding to a temperature sensor signal or the like is equally possible. It is also possible for the user to preset the clock preselection. Different parameters may jointly establish the clock-determining state.
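How several parameters may jointly establish the clock-determining state can be sketched as follows. The sketch assumes, purely for illustration, that each parameter (temperature, power source, user preset) contributes an upper bound on the clock and that the most restrictive bound wins; the function and parameter names are hypothetical.

```python
def effective_clock_hz(requested_hz, *, thermal_cap_hz=None,
                       battery_cap_hz=None, user_cap_hz=None):
    """Combine independent clock limits into one clock value.

    Assumed policy: every available limit is an upper bound and the
    lowest bound determines the clock. A limit of None means that the
    corresponding parameter imposes no restriction.
    """
    caps = [c for c in (thermal_cap_hz, battery_cap_hz, user_cap_hz) if c is not None]
    return min([requested_hz, *caps])
```

For example, a cell whose task would justify 100 MHz is clocked at 40 MHz if a low supply voltage imposes a 40 MHz limit while the temperature sensor would still allow 80 MHz.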
It was mentioned above that it is possible to perform time division multiplexing for carrying out multiple configurations on the same PAE. A preferred and enhanced design makes particularly resource-saving time division multiplexing for carrying out multiple configurations on the same PAE possible; the design may have advantages independently from the different clocking of individual cells, e.g., when latencies have to be taken into account which occur in the signal transmission of digital data via a bus, such as configuration data, data to be processed, or the like. These problems are particularly serious when reconfigurable modules, having reconfigurable units which are located in part comparatively far apart from one another, are to be operated at high clock frequencies. The problem arises here that due to the special configuration of VPUs, a plurality of arbitrary PAEs is connected via buses and considerable data transmission traffic exists via the buses. The switching frequency of transistors is expected to further increase in modern and above all in future silicon technologies, while the signal transmission via buses is to increasingly become a performance-limiting factor. It is therefore suggested to decouple the data rate or frequency on the buses vis-a-vis the operating frequency of the data-processing PAEs.
A particularly simple embodiment, preferred for simple implementations, operates in such a way that the clock rate of a VPU is only globally settable. In other words, a settable clock may be preselected for all PAEs or it may be configured by a higher-level configuration unit (CT). All parameters which have an effect on clocking determine this one global clock. Such parameters may be, for example, a temperature determination, a power reserve measurement of batteries, etc.
A determining parameter may be in particular the maximum operating frequency of the slowest configuration, which results as a function of a PAE configuration or a configuration of a group of PAEs. Since different configurations may include different numbers of PAEs over stretches of bus connections of different lengths, it was realized, in particular in applications limited by bus signal transmission, that configurations may have different maximum frequencies. As is known from FPGAs, for example, these maximum frequencies depend on the particular function of the PAEs and in particular on the lengths of bus connections. Clocking according to the slowest configuration then ensures the proper operation of this configuration and simultaneously reduces the power demand of all other configurations. This is advantageous in particular when the results of the other configurations, which could possibly run at higher clock frequencies, are not needed before the slowest configuration has finished. Also in cases where it must be absolutely ensured that proper operation takes place, the possibly only negligible performance loss caused by clocking down other configurations, which could run faster per se, is often acceptable.
In an optimized embodiment, the frequency is adapted only to the configurations which are currently carried out on a VPU, in other words, the global frequency may be reset/reconfigured with each configuration.
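A global clock that is re-determined whenever a configuration is loaded or removed may be sketched as follows. The class name, the configuration names, and the behavior when no configuration is active (falling back to a ceiling frequency) are assumptions made for illustration.

```python
class GlobalClock:
    """Track the active configurations and derive one global clock.

    The global clock is the lowest maximum frequency among the
    configurations currently carried out, bounded by a ceiling that
    is assumed here to be a property of the module.
    """

    def __init__(self, ceiling_hz):
        self.ceiling_hz = ceiling_hz
        self.active = {}  # configuration name -> maximum frequency in Hz

    def frequency(self):
        return min([self.ceiling_hz, *self.active.values()])

    def load(self, name, max_hz):
        """Register a configuration and return the new global clock."""
        self.active[name] = max_hz
        return self.frequency()

    def unload(self, name):
        """Remove a configuration and return the new global clock."""
        self.active.pop(name, None)
        return self.frequency()
```

Removing the slowest configuration allows the remaining configurations to run faster again, which is the benefit of resetting the global frequency with each configuration.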
In an enhanced embodiment, the clock may then be configured globally, as well as, as described above, individually for each configurable element.
It should be noted that different variants are possible, individually or in combination. In order to show a detailed example, it is assumed in the following, without this necessarily being the case, that the clock may be controlled individually in each PAE. This offers the following possibilities, for example:
a) Controlled Enabling and Disabling of the Clock
It is preferred that the processing clock of PAEs is disabled, i.e., the PAEs operate only in case of need; clock enabling, i.e., activating the PAE, may take place, for example, under at least one of the following conditions, namely
when valid data is present; when the result of the previous computation is approved; due to one or more trigger signals; due to an expected or valid timing mark, compare DE 101 10 530.4 (PACT18).
In order to cause clock enabling, each individual condition may be used either individually or in combination with other conditions, clock enabling being computed based on the logical combination of conditions. It should be noted that it is possible to put the PAEs into a power-saving operating mode while a clock is disabled, for example, through additionally partly switched-off or reduced power supply, or, should it be necessary because of other reasons, through extremely reduced sleeping clocks.
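The computation of clock enabling from a logical combination of conditions may be sketched as follows. The status fields mirror the conditions listed above; the particular combination (a set of conditions that must all hold and an optional set of which at least one must hold) is an assumed example of such a logical combination.

```python
from dataclasses import dataclass


@dataclass
class PaeStatus:
    data_valid: bool       # valid data is present
    result_accepted: bool  # result of the previous computation is approved
    trigger: bool          # one or more trigger signals are active
    timing_mark: bool      # an expected or valid timing mark is present


def clock_enable(status, require_all=("data_valid", "result_accepted"), require_any=()):
    """Return True if the PAE clock is to be enabled.

    require_all: condition names that must all be true.
    require_any: condition names of which at least one must be true
                 (ignored when empty).
    """
    all_ok = all(getattr(status, name) for name in require_all)
    any_ok = any(getattr(status, name) for name in require_any) if require_any else True
    return all_ok and any_ok
```

In a configurable implementation, the two tuples would correspond to configuration bits selecting which conditions participate in the enable term.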
b) Different Frequencies per PAE
Technologies for controlling sequences in VPUs are known from PCT/DE 97/02949 (PACT02/PCT), PCT/DE 97/02998 (PACT04/PCT), and PCT/DE 00/01869 (PACT13/PCT). Special sequencers (SWTs) which control a large number of PAEs and which are responsible for their (re)configuration are configured in PCT/DE 97/02998 (PACT04/PCT). The (re)configuration is controlled by using status signals which are generated by the PAEs (triggers) and passed on to the SWTs, namely in that the SWT responds to the triggers, making the particular continuation of a sequence dependent on the triggers.
A small memory for their configuration is assigned to each individual PAE in PCT/DE 97/02949 (PACT02/PCT). A sequencer passes through the memory and addresses the individual configurations. The sequencer is controlled by triggers and/or by the status of its PAE (into which it may be integrated, for example).
During data processing, it is now possible that different sequencers in different PAEs have to carry out a different number of operations per transmitted data packet (compare DE 101 39 170.6 (PACT11), DE 101 42 903.7 (PACT11a), DE 101 44 732.9 (PACT11b), DE 101 45 792.8 (PACT11c), DE 101 54 260.7 (PACT11d), DE 102 07 225.6 (PACT11e), PCT/DE 00/01869 (PACT13/PCT)). This is described using a configuration as an example in which 3 sequencers are involved in processing a data packet, requiring a different number of operations for data packet processing. Example:
In order to obtain an optimum operation/power consumption ratio, the individual sequencers would have to be clocked as follows:
FSeq3=Fmax, FSeq1=Fmax/2, FSeq2=Fmax/4
or at a maximum operating frequency of, for example, 100 MHz: FSeq1=50 MHz, FSeq2=25 MHz, FSeq3=100 MHz.
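The frequency assignment can be reproduced with a short calculation in which each sequencer is clocked in proportion to the number of operations it has to carry out per data packet, the busiest sequencer running at the maximum frequency. The operation counts used below are assumed values, chosen only so that the result matches the frequencies stated above.

```python
from fractions import Fraction


def sequencer_clocks(ops_per_packet, f_max_hz):
    """Clock each sequencer in proportion to its operations per packet.

    The sequencer with the most operations runs at f_max_hz; all
    others are slowed down so that they finish at the same time.
    """
    busiest = max(ops_per_packet.values())
    return {
        name: Fraction(f_max_hz * ops, busiest)
        for name, ops in ops_per_packet.items()
    }


# Assumed operation counts per data packet (illustrative only).
ops = {"Seq1": 10, "Seq2": 5, "Seq3": 20}
clocks = sequencer_clocks(ops, 100_000_000)
```

With these counts, all three sequencers need the same time per data packet, so that none of them runs blank cycles while waiting for the others.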
It is suggested in particular to use different clock sources for each PAE and/or group of PAEs. For example, different techniques may be used for this purpose, either individually or jointly:
An exemplary embodiment having different algorithms is illustrated in
c) Configuration Clock
Optimization of the power consumption is also favored in that the circuit components, necessary for executing a configuration, are clocked selectively, i.e., it is suggested to clock each PAE addressed and/or to completely disable the clock of those circuit components necessary for executing a configuration or a reconfiguration when no configuration or reconfiguration is being executed and/or to use static registers.
In particular example embodiments, the operating frequency of the PAEs or groups of PAEs may be made dependent on different and/or additional factors. The following is listed below as an example:
1. Temperature Measurement
If the operating temperature reaches certain threshold values, the operating clock is reduced correspondingly. The reduction may take place selectively by initially operating on a lower clock those PAEs whose slowdown causes the least significant performance loss.
In a particularly preferred embodiment, multiple temperature measurements may be performed in different regions and clocking may be adapted locally.
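Local adaptation of the clocking to regional temperature measurements may be sketched as follows. The threshold of 85 degrees Celsius, the halving of the clock, and the mapping of PAEs to regions are assumptions for the purpose of the example.

```python
def throttle_hot_regions(clocks_hz, region_of, temps_c, threshold_c=85.0, divisor=2):
    """Reduce the clock of every PAE located in a region that is too hot.

    clocks_hz: PAE name -> current clock in Hz
    region_of: PAE name -> name of the region containing the PAE
    temps_c:   region name -> measured temperature in degrees Celsius
    """
    throttled = {}
    for pae, hz in clocks_hz.items():
        hot = temps_c[region_of[pae]] >= threshold_c
        throttled[pae] = hz // divisor if hot else hz
    return throttled
```

PAEs in cool regions keep their clock, so the performance loss is confined to the region in which the threshold was actually reached.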
2. Buffer Filling Levels
IO-FIFOs (input-output-first-in-first-out-circuits) which decouple peripheral data transmissions from data processing within a VPU are described in DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT15a), (PACT15b). One buffer for input data (input buffer) and/or one buffer for output data (output buffer) may be implemented, for example. A particularly efficient variable for determining the clock frequency may, for example, be determined from the filling level of the particular data buffers. The following effects and measures may occur, for example:
Depending on the application and the system, suitable combinations may be implemented accordingly.
It should be pointed out that such a clock frequency determination is implementable if a filling level determination means for a buffer, in particular an input and/or output buffer, alternatively also an intermediate buffer within a VPU array, is provided and if this filling level determination means is connected to a clock preselecting means for preselecting logic cell clocking so that this clock preselecting means is able to change the logic cell clocking in response to the buffer filling level.
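A clock preselection responding to buffer filling levels may be sketched as follows. The policy shown (slow down when the output buffer is nearly full or the input buffer is nearly empty, speed up when the input buffer is nearly full) and the thresholds are assumptions chosen for illustration and are only one of the suitable combinations.

```python
def clock_from_fill(in_fill, out_fill, f_min_hz, f_max_hz, high=0.75, low=0.25):
    """Select a clock from the filling levels of input and output buffer.

    in_fill and out_fill are filling levels between 0.0 (empty) and
    1.0 (full). The thresholds 'high' and 'low' are assumed values.
    """
    if not (0.0 <= in_fill <= 1.0 and 0.0 <= out_fill <= 1.0):
        raise ValueError("filling levels must lie between 0.0 and 1.0")
    if out_fill >= high:   # results cannot be drained fast enough
        return f_min_hz
    if in_fill >= high:    # input data is piling up
        return f_max_hz
    if in_fill <= low:     # hardly any data to be processed
        return f_min_hz
    return (f_min_hz + f_max_hz) // 2
```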
3. Battery Charge State
In mobile units, the power supply, e.g., a battery, must be used sparingly. Depending on the power reserve, which may be determined based on the existing methods according to the related art, the frequency of PAEs and/or groups of PAEs is determined and is reduced in particular when the power reserve is low.
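A stepwise reduction of the frequency as a function of the power reserve may be sketched as follows. The reserve thresholds and the associated scaling factors are hypothetical values; an actual implementation would choose them according to the application.

```python
from bisect import bisect_right

# Assumed policy: reserve below 10 % -> quarter clock, below 30 % ->
# half clock, below 60 % -> three-quarter clock, otherwise full clock.
RESERVE_THRESHOLDS = [0.10, 0.30, 0.60]
CLOCK_SCALES = [0.25, 0.50, 0.75, 1.00]


def battery_scaled_clock_hz(f_nominal_hz, reserve):
    """Scale the nominal clock according to the remaining power reserve.

    reserve is the remaining charge as a fraction between 0.0 and 1.0.
    """
    if not 0.0 <= reserve <= 1.0:
        raise ValueError("reserve must lie between 0.0 and 1.0")
    step = bisect_right(RESERVE_THRESHOLDS, reserve)
    return int(f_nominal_hz * CLOCK_SCALES[step])
```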
Besides or in addition to optimizing data processing clocking it is also possible to accomplish an optimization of the data transmission with respect to the relationship between data transmission and data processing.
In a particular embodiment, the clock controls of PAEs described may be enhanced in such a way that, by using a sequencer-like activation and a suitable register set, for example, multiple, preferably different, configuration words may be executed successively in multiple clocks. A sequencer, sequentially processing a number of configuration inputs, may be additionally assigned to the configuration registers and/or to a configuration memory which is possibly also decoupled and implemented separately (compare DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT15a, PACT15b)). The sequencer may be designed as a microcontroller. In particular, the sequencer may be programmable/configurable in its function such as Altera's module EPS448 (ALTERA Data Book 1993). Possible embodiments of such PAEs are described, for example, in the following patent applications which are included in their entirety for the purpose of disclosure: PCT/DE 97/02949 (PACT02/PCT), PCT/DE 97/02998 (PACT04/PCT), PCT/DE 00/01869 (PACT13/PCT), DE 101 10 530.4 (PACT18), DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT15a, PACT 15b).
For the following, it is initially assumed that multiple configuration words are combined into one configuration (PACKEDCONF) and are configured on a PAE. The PACKEDCONF is processed in such a way that the individual configuration words are executed in chronological succession. The data exchange and/or status exchange between the individual timed configurations takes place via a suitable data feedback in the PAEs; for example by using a suitable register set and/or another data exchange and/or status exchange means such as suitable memories and the like.
This method allows a different timing for PAEs and bus systems. While PAEs process data at very high clock rates, for example, operands and/or results are transmitted via a bus at only a fraction of the clock rate of the PAEs. The transmission time via a bus may be correspondingly longer.
It is preferred if not only the PAEs or other logic units in a configurable and/or reconfigurable module are clockable at a different rate, but also if different clocking is provided for parts of a bus system. It is possible here to provide multiple buses in parallel whose speed is clocked differently, i.e., a bus which is clocked particularly high for providing a high-performance connection, parallel to a bus which is clocked lower for providing a power-saving connection. The connection clocked high may be used when longer signal paths have to be compensated, or when PAEs, positioned close together, operate at a high frequency and therefore also have to exchange data at a high frequency in order to provide a good transmission here over short distances in which the latency plays a minor role at best. Therefore, it is suggested in a possible embodiment that a number of PAEs, positioned together locally and combined in a group, operate at a high frequency and possibly also sequentially and that local and correspondingly short bus systems are clocked high corresponding to the data processing rate of the group, while the bus systems, inputting the operands and outputting the results, have slower clock and data transmission rates. For the purpose of optimizing the power consumption, it would be alternatively possible to implement slow clocking and to supply data at a high speed, e.g., when a large quantity of inflowing data may be processed with only a minor operational effort, thus at low clock rates.
In addition to the possibility of providing bus systems which are clocked using different frequencies it is also possible to provide multiple bus systems which are operable independently from one another and to then apply the PAEs in a multiplex-like manner as required. This alone makes it possible to operate reconfigurable modules particularly efficiently in resource multiplexing, independently from the still existing possibility of differently clocking different bus systems or different bus system parts. It is possible here to assign different configurations to different resources according to different multiplexing methods.
According to PCT/DE 00/01869 (PACT13/PCT), a group of PAEs may be designed as a processor in particular.
In the following embodiments, for example, different configurations are assigned to data-processing PAEs using time-division multiplexing, while bus systems are assigned to the different configurations using space-division multiplexing.
In the assignment of resources, i.e., the assignment of tasks to PAEs or a group of PAEs to be carried out by the compiler or a similar unit, the given field may then be considered as a field of the n-fold variable and code sections may be transferred to this field of resources, which is virtually scaled up by the factor n, without the occurrence of problems, particularly when code sections are transferred in such a way that no interdependent code sections have to be configured into a PAE which is used in a multiplex-like manner.
In the previous approach, a PACKEDCONF was composed of at least one configuration word or a bundle of configuration words for PAEs which belong to one single application. In other words, only configuration words which belong together were combined in the PACKEDCONF.
In an enhanced embodiment, at least one or more configuration words per each different configuration are entered into a PACKEDCONF in such a way that the configuration word or words which belong together in a configuration are combined in a configuration group and the configuration groups thus created are combined in the PACKEDCONF.
The individual configuration groups may be executed in chronological succession, thus in time-division multiplexing by a timeslice-like assignment. This results in time division multiplexing of different configuration groups on one PAE. As described above, the configuration word or the configuration words within a configuration group may also be executed in chronological succession.
Multiplexers which select one of the configuration groups are assigned to the configuration registers and/or to a configuration memory, which is possibly also decoupled and implemented separately (compare DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT15a, PACT 15b)). In an enhanced embodiment, a sequencer (as described above) may be additionally assigned which makes the sequential processing of configuration words within configuration groups possible.
Using the multiplexers and the optional sequencer, a resource (PAE) may be assigned to multiple different configurations in a time-division multiplex method.
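The time-division multiplexing of configuration groups on one PAE may be sketched as follows: an outer loop plays the role of the multiplexer selecting a configuration group, and an inner loop plays the role of the optional sequencer stepping through the configuration words of that group. The data layout of the PACKEDCONF and the word names are assumptions for illustration.

```python
from itertools import cycle, islice

# Hypothetical PACKEDCONF: a list of configuration groups, each group
# being a list of configuration words belonging to one configuration.
packedconf = [
    ["cfgA.word0", "cfgA.word1"],  # configuration group 0
    ["cfgB.word0"],                # configuration group 1
]


def timeslices(conf):
    """Yield (group id, configuration word) in order of execution.

    The groups are selected cyclically one after another; within a
    group the words are executed in chronological succession.
    """
    for group_id, group in cycle(enumerate(conf)):
        for word in group:
            yield group_id, word


# The first five timeslices executed on the PAE.
schedule = list(islice(timeslices(packedconf), 5))
```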
Among one another, different resources may synchronize the particular configuration group to be applied, for example by transmitting a configuration group number or a pointer.
The execution of the configuration groups may take place linearly in succession and/or cyclically, with a priority being observed. It should be noted here in particular that different sequences may be processed in a single processor element and that different bus systems may be provided at the same time so that no time is wasted in establishing a bus connection which may take some time due to the long transmission paths. If a PAE assigns its first configuration to a first bus system and, on execution of the first configuration, couples the same to the bus system, then it may, in a second configuration, couple a different or partially different bus system to the former if spatial multiplexing for the bus system is possible.
The execution of a configuration group, each configuration group being composed of one or more configuration words, may be made dependent on the reception of an execution release via data and/or triggers and/or an execution release condition.
If the execute release (condition) for a configuration group is not given, the execute release (condition) may either be awaited, or the execution of a subsequent configuration group may be continued. The PAEs preferably go into a power-saving operating mode during the wait for an execute release (condition), for example with a disabled clock (gated clock) and/or partially disabled or reduced power supply. If a configuration group cannot be activated, then, as mentioned above, the PAEs preferably also go into a power-saving mode.
The storage of the PACKEDCONF may take place by using a ring-type memory or other memory or register means, the use of a ring-type memory resulting in the fact that after the execution of the last input, the execution of the first input may be started again (compare PCT/DE 97/02998 (PACT04/PCT)). It should be noted that it is also possible to skip to a particular execution directly and/or indirectly and/or conditionally within the PACKEDCONF and/or a configuration group.
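The cyclic processing of a ring-type memory in combination with an execute release may be sketched as follows. The sketch assumes the variant in which a configuration group without execute release is skipped in favor of a subsequent group; if no group is released, the caller is expected to put the PAE into a power-saving mode. The class and method names are hypothetical.

```python
class RingScheduler:
    """Select configuration groups from a ring-type memory.

    After the last entry, selection continues with the first entry.
    """

    def __init__(self, n_groups):
        if n_groups < 1:
            raise ValueError("at least one configuration group is required")
        self.n_groups = n_groups
        self.position = 0  # ring position examined first on the next call

    def next_group(self, released):
        """Return the next released group id, or None if none is released.

        released: callable taking a group id and returning True if the
        execute release (condition) for that group is given.
        """
        for offset in range(self.n_groups):
            group_id = (self.position + offset) % self.n_groups
            if released(group_id):
                self.position = (group_id + 1) % self.n_groups
                return group_id
        return None  # no release given: the PAE may enter power saving
```

Waiting for the release of one particular group instead of skipping it would correspond to calling the release check repeatedly for the same position.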
In a preferred method, PAEs may be designed for processing of configurations in a corresponding time-division multiplexing method. The number of bus systems between the PAEs is increased such that sufficient resources are available for a sufficient number of configuration groups. In other words, the data-processing PAEs operate in a time-division multiplex method, while the data-transmitting and/or data-storing resources are adequately available.
This represents a type of space division multiplexing, a first bus system being assigned to a first temporarily processed configuration, and a second bus system being assigned to an additional configuration; the second bus system runs or is routed spatially separated from the first bus system.
It is possible at the same time and/or alternatively that the bus systems are also entirely or partially operated in time-division multiplexing and that multiple configuration groups share one bus system. It may be provided here that each configuration group transmits its data as a data packet, for example, a configuration group ID being assigned to the data packet (compare APID in DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT 15a, PACT 15b)). Subsequently it may be provided to store and sort the particular data packets transmitted based on their assigned identification data, namely between different buses if required and for coordinating the IDs.
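The sharing of one bus system by multiple configuration groups may be sketched, under simplifying assumptions, as follows. The function names `multiplex` and `demultiplex`, the round-robin slot assignment, and the example data are hypothetical; the text above only requires that each data packet carries an ID by which it can be sorted at the receiving side.

```python
from collections import defaultdict


def multiplex(streams):
    """Interleave per-group data onto one shared bus as (group_id, word) packets.

    `streams` maps a configuration group ID to the list of words it transmits.
    Round-robin slot assignment is an assumption made for this sketch.
    """
    bus = []
    longest = max((len(words) for words in streams.values()), default=0)
    for slot in range(longest):
        for group_id, words in streams.items():
            if slot < len(words):
                bus.append((group_id, words[slot]))
    return bus


def demultiplex(bus):
    """Sort received packets into per-group queues using the attached ID."""
    queues = defaultdict(list)
    for group_id, word in bus:
        queues[group_id].append(word)
    return dict(queues)


streams = {"ga": [1, 2, 3], "gb": [10], "gc": [20, 21]}
bus = multiplex(streams)
received = demultiplex(bus)
```

Because every packet carries its group ID, the receiver recovers each group's data in its original order even though the packets of different groups are interleaved on the bus.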
In an enhanced method, memory sources may also be run in a time-division multiplex, e.g., by implementing multiple segments and/or, at a change of the configuration group, by writing the particular memory/memories as described in PCT/DE 97/02998 (PACT04/PCT) and/or PCT/DE 00/01869 (PACT13/PCT) into a different or even external memory or by loading from the same. In particular the methods as described in DE 102 06 653.1 (PACT15), DE 102 07 224.8 (PACT15a, PACT 15b) may be used (e.g., MMU paging and/or APID).
The adaptation of the operating voltage to the clock should be noted as a further possibility for conserving resources.
Semiconductor processes typically allow higher clock frequencies when they are operated at higher operating voltages. However, this causes substantially higher power consumption and may also reduce the service life of a semiconductor.
An optimum compromise may be achieved in that the voltage supply is made dependent on the clock frequency. Low clock frequencies may be operated at a low supply voltage, for example. With increasing clock frequencies, the supply voltage is also increased (preferably up to a defined maximum).
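A minimal sketch of such a dependency is given below. The linear relationship and all numerical constants (voltage limits and frequency range) are assumptions chosen only for illustration; a real characteristic depends on the semiconductor process used.

```python
def supply_voltage(clock_mhz, v_min=0.9, v_max=1.8, f_low=50.0, f_high=400.0):
    """Return an illustrative supply voltage in volts for a given clock in MHz.

    Below f_low the minimum voltage is used; above f_high the voltage is
    clamped to the defined maximum; in between it rises linearly (assumption).
    """
    if clock_mhz <= f_low:
        return v_min
    if clock_mhz >= f_high:
        return v_max
    fraction = (clock_mhz - f_low) / (f_high - f_low)
    return v_min + fraction * (v_max - v_min)


for f in (10, 100, 225, 400, 800):
    print(f, "MHz ->", round(supply_voltage(f), 3), "V")
```

The clamp at `v_max` corresponds to the defined maximum mentioned above; the clamp at `v_min` reflects that a circuit typically needs some minimum voltage to operate at all.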
The present invention, as an example, is explained in greater detail below with reference to the Figures. It should be noted that this exemplary description is not limiting and that in isolated cases and in different figures identical or similar units may be denoted using different reference numbers.
As an example,
According to
Furthermore, a multiplexer 0213 for selecting different configurations and/or configuration groups may optionally be integrated dependent on 0212. Furthermore, the multiplexer may optionally be activated by a sequencer 0214 in order to make sequential data processing possible. In particular, intermediate results may be managed in data memory 0207.
While the general configuration of the cell was described in part in the applicant's applications described above, the presently described clock dividing system, the associated circuit, and the optimization of its operation are at least novel and it should be pointed out that these facts may and shall be associated with the required hardware changes.
The entire system, and in particular configuration unit 0103, is designed in such a way that a clock dividing/multiplying signal may be transmitted together with the configuring signal by which a configuration word is fed via configuration line 0103a and configuration word extractor 0209 to data processing unit 0206 or to upstream and/or downstream and/or associated memory 0208. The clock dividing/multiplying signal is extracted by configuration word extractor 0209 and transmitted to frequency divider/multiplier 0210, so that, in response, 0210 may clock data processing unit 0206 and possibly also other units. It should be pointed out that there are also other possibilities, instead of unit 0209, for varying the clocking of an individual data processing unit 0206 relative to central clock unit 0104 in response to an input signal to the cell, for example via data bus monitoring circuit 0212.
Described only as an example with reference to
For example, a 3×3 field of reconfigurable cells is configured in such a way, according to
If the processor unit having the separately clockable reconfigurable logic cells is operated in an application where the voltage may drop, e.g., because the capacity of the voltage supply is being exhausted, it may be provided that, when the supply voltage drops to a critical value U1, the entire frequency is reduced; all cells are subsequently clocked slower by one half, so that division cell 0102h also runs at only 128 MHz, while cell 0102d is clocked at 4 MHz. Cell 0102a, which executes the lower-priority query of the mouse pointer, is no longer clocked at 8 MHz as previously but rather at 2 MHz; i.e., depending on the prioritization, different slowdowns according to the importance of the task are assigned to the respective groups at a voltage drop or under other circumstances.
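The priority-dependent slowdown may be illustrated with the clock rates named above. In the following sketch, the nominal rates of cells 0102h and 0102d are inferred from the statement that these cells are clocked slower by one half, and the table of slowdown divisors is a hypothetical data structure introduced only for this example.

```python
# Clock rates in MHz before the voltage drop. The values for 0102h and 0102d are
# inferred from "clocked slower by one half"; the 8 MHz of 0102a is stated in the text.
nominal_mhz = {"0102h": 256, "0102d": 8, "0102a": 8}

# Hypothetical slowdown divisors applied once the supply voltage reaches U1.
# The low-priority mouse-pointer query in 0102a is slowed down more strongly.
slowdown_at_u1 = {"0102h": 2, "0102d": 2, "0102a": 4}


def clocks_after_drop(nominal, slowdown):
    """Return the reduced clock rate of each cell after the voltage drop."""
    return {cell: nominal[cell] / slowdown[cell] for cell in nominal}


reduced = clocks_after_drop(nominal_mhz, slowdown_at_u1)
print(reduced)  # {'0102h': 128.0, '0102d': 4.0, '0102a': 2.0}
```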
If, for other reasons, the temperature still rises, the heat generation in the logic cell field may be further reduced by an additional clock rate reduction for the logic cells, as is indicated in the last row of
This makes it possible to operate a processor field in an optimally energy-efficient manner; the required cooling capacity is reduced, and since as a rule not all cells may and/or must be permanently operated at the highest clock frequency, heat sinks and the like may be dimensioned correspondingly smaller, which in turn offers additional cost advantages.
It should be noted that in addition to the query regarding a supply voltage, a temperature, the prioritization of computations, and the like, other conditions may determine the clock. For example, a hardware switch or a software switch may be provided with which the user indicates that only low clocking or higher clocking is desired. This makes an even more economical and targeted handling of the available power possible. It may be provided in particular that, at the user's request or at an external request, the central clock rate in total may be reduced; the clock divider ratios within the cell array, however, are not changed in order to avoid the requirement of reconfiguring all cells, e.g., at an extreme temperature rise. Moreover, it should be pointed out that a hysteresis characteristic may be provided in determining the clock rates, when a temperature-sensitive change of the clock frequencies is to be performed, for example.
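A hysteresis characteristic for a temperature-sensitive change of the central clock rate may be sketched as follows. The class name `ThermalClockGovernor`, the two temperature thresholds, and the halving of the central clock are assumptions for illustration only; the clock divider ratios of the individual cells are deliberately not touched in this model.

```python
class ThermalClockGovernor:
    """Hypothetical two-threshold (hysteresis) control of the central clock."""

    def __init__(self, high_c=85.0, low_c=75.0):
        self.high_c = high_c  # reduce the clock at or above this temperature
        self.low_c = low_c    # restore the clock at or below this temperature
        self.reduced = False

    def update(self, temperature_c):
        """Update and return whether the reduced clock is currently selected."""
        if not self.reduced and temperature_c >= self.high_c:
            self.reduced = True
        elif self.reduced and temperature_c <= self.low_c:
            self.reduced = False
        return self.reduced

    def central_clock(self, nominal_mhz, temperature_c):
        """Scale only the central clock; per-cell divider ratios stay unchanged."""
        return nominal_mhz / 2 if self.update(temperature_c) else nominal_mhz


governor = ThermalClockGovernor()
states = [governor.update(t) for t in (70, 86, 80, 76, 74, 80)]
print(states)  # [False, True, True, True, False, False]
```

Between the two thresholds the previous state is retained, so that a temperature fluctuating around a single limit does not cause the clock rate to toggle continuously.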
The data transmission occurring on data bus 0205a/b is illustrated in
In order to execute op1, operands ia must be available via 0205a (0601); the data transmissions for the remaining cycles may be undefined in principle.
Thereafter, 0205a may preferably transmit the subsequent operands (0602) for which the execution time of op2, op3, op4, op5 is available, thus creating a temporal decoupling, allowing the use of slower and/or, in particular, longer bus systems.
During the execution of op2, op3, op4, op5, data of other configurations may alternatively (0603) be transmitted via the same bus system 0205a using a time-division multiplex method.
Following op5, result oa is applied to bus 0205b (0601); the data transmissions for the remaining cycles may be undefined in principle.
The time prior to op5, i.e., during the execution of op1, op2, op3, op4, may be used for transmitting the previous result (0602). This again creates a temporal decoupling, allowing the use of slower and/or, in particular, longer bus systems.
During the execution of op1, op2, op3, op4, data of other configurations may alternatively (0603) be transmitted via the same bus system 0205b using a time-division multiplex method.
For clock multiplication, 0210 may use a PLL. A PLL may be used in particular in such a way that the operating clock of the PAE for executing op1, op2, op3, op4, op5 is five times the bus clock. In this case, the PAE may act like a PAE without a sequencer, having only one (unicyclical) configuration and the same clock as the bus clock.
One configuration group may contain multiple configuration words (ga={ka1, ka2}, gb={kb1}, gc={kc1, kc2, kc3}). The configuration words may be executed sequentially in 0214 using a sequencer.
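The sequential execution of the configuration words of the groups named above may be sketched as follows. The function `schedule`, the chosen group order, and the counting of one cycle per configuration word are assumptions made for this illustration.

```python
groups = {"ga": ["ka1", "ka2"], "gb": ["kb1"], "gc": ["kc1", "kc2", "kc3"]}


def schedule(groups, order):
    """Yield (cycle, group, word) tuples in the order a sequencer would step through them.

    One cycle per configuration word is an assumption of this sketch.
    """
    cycle = 0
    for name in order:
        for word in groups[name]:
            yield cycle, name, word
            cycle += 1


timeline = list(schedule(groups, ["ga", "gb", "gc"]))
for cycle, name, word in timeline:
    print(cycle, name, word)
```

With the three groups above, six cycles are needed before the sequence may start again with the first group.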
In addition, a possible bus transmission using a time-division multiplex for the bus systems is illustrated in 0704. The input data of all groups is transmitted via an input bus system and the output data of all groups is transmitted via an output bus system. The undefined intermediate cycles are either unused or are free for other data transmissions.
Number | Date | Country | Kind |
---|---|---|---|
101 10 530 | Mar 2001 | DE | national |
101 11 014 | Mar 2001 | DE | national |
PCT/EP01/06703 | Jun 2001 | WO | international |
101 29 237 | Jun 2001 | DE | national |
1115021 | Jun 2001 | EP | regional |
101 35 210 | Jul 2001 | DE | national |
101 35 211 | Jul 2001 | DE | national |
PCT/EP01/08534 | Jul 2001 | WO | international |
101 39 170 | Aug 2001 | DE | national |
101 42 231 | Aug 2001 | DE | national |
101 42 894 | Sep 2001 | DE | national |
101 42 903 | Sep 2001 | DE | national |
101 42 904 | Sep 2001 | DE | national |
101 44 732 | Sep 2001 | DE | national |
101 44 733 | Sep 2001 | DE | national |
101 45 792 | Sep 2001 | DE | national |
101 45 795 | Sep 2001 | DE | national |
101 46 132 | Sep 2001 | DE | national |
PCT/EP01/11299 | Sep 2001 | WO | international |
PCT/EP01/11593 | Oct 2001 | WO | international |
101 54 259 | Nov 2001 | DE | national |
101 54 260 | Nov 2001 | DE | national |
1129923 | Dec 2001 | EP | regional |
2001331 | Jan 2002 | EP | regional |
102 02 044 | Jan 2002 | DE | national |
102 02 175 | Jan 2002 | DE | national |
102 06 653 | Feb 2002 | DE | national |
102 06 856 | Feb 2002 | DE | national |
102 06 857 | Feb 2002 | DE | national |
102 07 224 | Feb 2002 | DE | national |
102 07 225 | Feb 2002 | DE | national |
102 07 226 | Feb 2002 | DE | national |
102 08 434 | Feb 2002 | DE | national |
102 08 435 | Feb 2002 | DE | national |
This application is a divisional of U.S. patent application Ser. No. 13/653,639, filed Oct. 17, 2012, now U.S. Pat. No. 9,075,605, which is a continuation of U.S. patent application Ser. No. 12/570,984, filed on Sep. 30, 2009, now U.S. Pat. No. 8,312,301, which is a continuation of U.S. patent application Ser. No. 12/257,075, filed on Oct. 23, 2008, now U.S. Pat. No. 8,099,618, which is a divisional of U.S. patent application Ser. No. 10/469,909, filed on Sep. 21, 2004, now U.S. Pat. No. 7,444,531, which is a national phase of Int. Pat. App. No. PCT/EP02/02402, filed on Mar. 5, 2002, which claims priority to German Patent Application Serial No. DE 101 10 530.4, filed on Mar. 5, 2001, the entire contents of each of which are expressly incorporated herein by reference thereto.
Number | Name | Date | Kind |
---|---|---|---|
3473160 | Wahlstrom et al. | Oct 1969 | A |
3531662 | Spandorfer et al. | Sep 1970 | A |
4020469 | Manning | Apr 1977 | A |
4412303 | Barnes et al. | Oct 1983 | A |
4454578 | Matsumoto et al. | Jun 1984 | A |
4539637 | DeBruler | Sep 1985 | A |
4577293 | Matick et al. | Mar 1986 | A |
4642487 | Carter | Feb 1987 | A |
4700187 | Furtek | Oct 1987 | A |
4706216 | Carter | Nov 1987 | A |
4722084 | Morton | Jan 1988 | A |
4724307 | Dutton et al. | Feb 1988 | A |
4748580 | Ashton et al. | May 1988 | A |
4758985 | Carter | Jul 1988 | A |
4768196 | Jou et al. | Aug 1988 | A |
4786904 | Graham, III et al. | Nov 1988 | A |
4791603 | Henry | Dec 1988 | A |
4837735 | Allen, Jr. et al. | Jun 1989 | A |
4862407 | Fette et al. | Aug 1989 | A |
4918440 | Furtek | Apr 1990 | A |
4959781 | Rubinstein et al. | Sep 1990 | A |
4967340 | Dawes | Oct 1990 | A |
5036473 | Butts et al. | Jul 1991 | A |
5055997 | Sluijter et al. | Oct 1991 | A |
5070475 | Normoyle et al. | Dec 1991 | A |
5081575 | Hiller et al. | Jan 1992 | A |
5103311 | Sluijter et al. | Apr 1992 | A |
5113498 | Evan et al. | May 1992 | A |
5119499 | Tonomura et al. | Jun 1992 | A |
5123109 | Hillis | Jun 1992 | A |
5144166 | Camarota et al. | Sep 1992 | A |
5197016 | Sugimoto et al. | Mar 1993 | A |
5212777 | Gove et al. | May 1993 | A |
5243238 | Kean | Sep 1993 | A |
5245227 | Furtek et al. | Sep 1993 | A |
RE34444 | Kaplinsky | Nov 1993 | E |
5261113 | Jouppi | Nov 1993 | A |
5287511 | Robinson et al. | Feb 1994 | A |
5296759 | Sutherland et al. | Mar 1994 | A |
5298805 | Garverick et al. | Mar 1994 | A |
5301340 | Cook | Apr 1994 | A |
5327570 | Foster et al. | Jul 1994 | A |
5336950 | Popli et al. | Aug 1994 | A |
5355508 | Kan | Oct 1994 | A |
5357152 | Jennings, III et al. | Oct 1994 | A |
5361373 | Gilson | Nov 1994 | A |
5386154 | Goetting et al. | Jan 1995 | A |
5386518 | Reagle et al. | Jan 1995 | A |
5394030 | Jennings, III et al. | Feb 1995 | A |
5408129 | Farmwald et al. | Apr 1995 | A |
5410723 | Schmidt et al. | Apr 1995 | A |
5412795 | Larson | May 1995 | A |
5421019 | Holsztynski et al. | May 1995 | A |
5426378 | Ong | Jun 1995 | A |
5430885 | Kaneko et al. | Jul 1995 | A |
5440711 | Sugimoto | Aug 1995 | A |
5448496 | Butts et al. | Sep 1995 | A |
5459846 | Hyatt | Oct 1995 | A |
5469003 | Kean | Nov 1995 | A |
5488582 | Camarota | Jan 1996 | A |
5500609 | Kean | Mar 1996 | A |
5502838 | Kikinis | Mar 1996 | A |
5504439 | Tavana | Apr 1996 | A |
5525971 | Flynn | Jun 1996 | A |
5572680 | Ikeda et al. | Nov 1996 | A |
5574930 | Halverson, Jr. et al. | Nov 1996 | A |
5581778 | Chin et al. | Dec 1996 | A |
5596743 | Bhat et al. | Jan 1997 | A |
5600597 | Kean et al. | Feb 1997 | A |
5608342 | Trimberger | Mar 1997 | A |
5619720 | Garde et al. | Apr 1997 | A |
5625836 | Barker et al. | Apr 1997 | A |
5631578 | Clinton et al. | May 1997 | A |
5635851 | Tavana | Jun 1997 | A |
5642058 | Trimberger et al. | Jun 1997 | A |
5646544 | Iadanza | Jul 1997 | A |
5646546 | Bertolet et al. | Jul 1997 | A |
5651137 | MacWilliams et al. | Jul 1997 | A |
5652529 | Gould et al. | Jul 1997 | A |
5656950 | Duong et al. | Aug 1997 | A |
5659785 | Pechanek et al. | Aug 1997 | A |
5671432 | Bertolet et al. | Sep 1997 | A |
5675262 | Duong et al. | Oct 1997 | A |
5675777 | Glickman | Oct 1997 | A |
5682491 | Pechanek et al. | Oct 1997 | A |
5685004 | Bruce et al. | Nov 1997 | A |
5687325 | Chang | Nov 1997 | A |
5696976 | Nizar et al. | Dec 1997 | A |
5701091 | Kean | Dec 1997 | A |
5705938 | Kean | Jan 1998 | A |
5715476 | Kundu et al. | Feb 1998 | A |
5721921 | Kessler et al. | Feb 1998 | A |
5734869 | Chen | Mar 1998 | A |
5742180 | DeHon et al. | Apr 1998 | A |
5748979 | Trimberger | May 1998 | A |
5752035 | Trimberger | May 1998 | A |
5761484 | Agarwal et al. | Jun 1998 | A |
5765009 | Ishizaka | Jun 1998 | A |
5774704 | Williams | Jun 1998 | A |
5778439 | Trimberger et al. | Jul 1998 | A |
5781756 | Hung | Jul 1998 | A |
5784636 | Rupp | Jul 1998 | A |
5805477 | Perner | Sep 1998 | A |
5808487 | Roy | Sep 1998 | A |
5812844 | Jones et al. | Sep 1998 | A |
5815004 | Trimberger et al. | Sep 1998 | A |
5828858 | Athanas et al. | Oct 1998 | A |
5832288 | Wong | Nov 1998 | A |
5857109 | Taylor | Jan 1999 | A |
5892962 | Cloutier | Apr 1999 | A |
5893165 | Ebrahim | Apr 1999 | A |
5894565 | Furtek et al. | Apr 1999 | A |
5898602 | Rothman et al. | Apr 1999 | A |
5905875 | Takahashi et al. | May 1999 | A |
5913925 | Kahle et al. | Jun 1999 | A |
5915123 | Mirsky et al. | Jun 1999 | A |
5933642 | Greenbaum et al. | Aug 1999 | A |
5943242 | Vorbach et al. | Aug 1999 | A |
5956518 | DeHon et al. | Sep 1999 | A |
5966534 | Cooke et al. | Oct 1999 | A |
5978583 | Ekanadham et al. | Nov 1999 | A |
5978830 | Nakaya et al. | Nov 1999 | A |
5990910 | Laksono et al. | Nov 1999 | A |
5991900 | Garnett | Nov 1999 | A |
6011407 | New | Jan 2000 | A |
6023564 | Trimberger | Feb 2000 | A |
6023742 | Ebeling et al. | Feb 2000 | A |
6034542 | Ridgeway | Mar 2000 | A |
6038646 | Sproull | Mar 2000 | A |
6049859 | Gliese et al. | Apr 2000 | A |
6052773 | DeHon et al. | Apr 2000 | A |
6058465 | Nguyen | May 2000 | A |
6075935 | Ussery et al. | Jun 2000 | A |
6076157 | Borkenhagen et al. | Jun 2000 | A |
6077315 | Greenbaum et al. | Jun 2000 | A |
6079008 | Clery, III | Jun 2000 | A |
6096091 | Hartmann | Aug 2000 | A |
6104696 | Kadambi et al. | Aug 2000 | A |
6108737 | Sharma et al. | Aug 2000 | A |
6119181 | Vorbach et al. | Sep 2000 | A |
6119219 | Webb et al. | Sep 2000 | A |
6122719 | Mirsky et al. | Sep 2000 | A |
6122720 | Cliff | Sep 2000 | A |
6124868 | Asaro et al. | Sep 2000 | A |
6128720 | Pechanek et al. | Oct 2000 | A |
6138198 | Garnett et al. | Oct 2000 | A |
6141734 | Razdan et al. | Oct 2000 | A |
6145072 | Shams et al. | Nov 2000 | A |
6148407 | Aucsmith | Nov 2000 | A |
6178494 | Casselman | Jan 2001 | B1 |
6209020 | Angle et al. | Mar 2001 | B1 |
6209065 | Van Doren et al. | Mar 2001 | B1 |
6215326 | Jefferson et al. | Apr 2001 | B1 |
6216174 | Scott et al. | Apr 2001 | B1 |
6219833 | Solomon et al. | Apr 2001 | B1 |
6226714 | Safranek et al. | May 2001 | B1 |
6226717 | Reuter et al. | May 2001 | B1 |
6237059 | Dean et al. | May 2001 | B1 |
6247036 | Landers et al. | Jun 2001 | B1 |
6263406 | Uwano et al. | Jul 2001 | B1 |
6286090 | Steely, Jr. et al. | Sep 2001 | B1 |
6289369 | Sundaresan | Sep 2001 | B1 |
6308191 | Dujardin et al. | Oct 2001 | B1 |
6314484 | Zulian et al. | Nov 2001 | B1 |
6321296 | Pescatore | Nov 2001 | B1 |
6321298 | Hubis | Nov 2001 | B1 |
6321373 | Ekanadham et al. | Nov 2001 | B1 |
6341318 | Dakhil | Jan 2002 | B1 |
6347346 | Taylor | Feb 2002 | B1 |
6374286 | Gee et al. | Apr 2002 | B1 |
6381687 | Sandstrom et al. | Apr 2002 | B2 |
6385672 | Wang et al. | May 2002 | B1 |
6405185 | Pechanek et al. | Jun 2002 | B1 |
6421757 | Wang et al. | Jul 2002 | B1 |
6425068 | Vorbach et al. | Jul 2002 | B1 |
6457100 | Ignatowski et al. | Sep 2002 | B1 |
6467009 | Winegarden et al. | Oct 2002 | B1 |
6501999 | Cai | Dec 2002 | B1 |
6522167 | Ansari et al. | Feb 2003 | B1 |
6526430 | Hung et al. | Feb 2003 | B1 |
6526461 | Cliff | Feb 2003 | B1 |
6538470 | Langhammer et al. | Mar 2003 | B1 |
6539438 | Ledzius et al. | Mar 2003 | B1 |
6571322 | Arimilli et al. | May 2003 | B2 |
6587939 | Takano | Jul 2003 | B1 |
6587961 | Garnett et al. | Jul 2003 | B1 |
6633181 | Rupp | Oct 2003 | B1 |
6643747 | Hammarlund et al. | Nov 2003 | B2 |
6658564 | Smith et al. | Dec 2003 | B1 |
6658578 | Laurenti et al. | Dec 2003 | B1 |
6665758 | Frazier et al. | Dec 2003 | B1 |
6708325 | Cooke et al. | Mar 2004 | B2 |
6757892 | Gokhale et al. | Jun 2004 | B1 |
6763327 | Songer et al. | Jul 2004 | B1 |
6795939 | Harris et al. | Sep 2004 | B2 |
6799265 | Dakhil | Sep 2004 | B1 |
6865662 | Wang | Mar 2005 | B2 |
6868476 | Rosenbluth et al. | Mar 2005 | B2 |
6871341 | Shyr | Mar 2005 | B1 |
6895452 | Coleman et al. | May 2005 | B1 |
6925641 | Elabd | Aug 2005 | B1 |
7000161 | Allen et al. | Feb 2006 | B1 |
7036106 | Wang et al. | Apr 2006 | B1 |
7043416 | Lin | May 2006 | B1 |
7100061 | Halepete | Aug 2006 | B2 |
7188234 | Wu et al. | Mar 2007 | B2 |
7210129 | May et al. | Apr 2007 | B2 |
7266725 | Vorbach et al. | Sep 2007 | B2 |
7340596 | Crosland et al. | Mar 2008 | B1 |
7581076 | Vorbach | Aug 2009 | B2 |
7924837 | Shabtay et al. | Apr 2011 | B1 |
7928763 | Vorbach | Apr 2011 | B2 |
7933838 | Ye | Apr 2011 | B2 |
8156284 | Vorbach et al. | Apr 2012 | B2 |
8463835 | Walke | Jun 2013 | B1 |
9047440 | Vorbach et al. | Jun 2015 | B2 |
9099460 | Cho et al. | Aug 2015 | B2 |
9147623 | Lua et al. | Sep 2015 | B2 |
20010003834 | Shimonishi | Jun 2001 | A1 |
20010032305 | Barry | Oct 2001 | A1 |
20020004916 | Marchand et al. | Jan 2002 | A1 |
20020010840 | Barroso et al. | Jan 2002 | A1 |
20020145545 | Brown | Oct 2002 | A1 |
20030014743 | Cooke et al. | Jan 2003 | A1 |
20030033514 | Appleby-Allis et al. | Feb 2003 | A1 |
20030046530 | Poznanovic | Mar 2003 | A1 |
20030101307 | Gemelli et al. | May 2003 | A1 |
20030120904 | Sudharsanan et al. | Jun 2003 | A1 |
20040093186 | Ebert et al. | May 2004 | A1 |
20050080994 | Cohen et al. | Apr 2005 | A1 |
20050257179 | Stauffer et al. | Nov 2005 | A1 |
20060036988 | Allen et al. | Feb 2006 | A1 |
20060095716 | Ramesh | May 2006 | A1 |
20060259744 | Matthes | Nov 2006 | A1 |
20070043965 | Mandelblat et al. | Feb 2007 | A1 |
20070050603 | Vorbach et al. | Mar 2007 | A1 |
20070143577 | Smith | Jun 2007 | A1 |
20070143578 | Horton et al. | Jun 2007 | A1 |
20100153654 | Vorbach et al. | Jun 2010 | A1 |
20110060942 | Vorbach | Mar 2011 | A1 |
20110145547 | Vorbach | Jun 2011 | A1 |
20120017066 | Vorbach et al. | Jan 2012 | A1 |
20140297914 | Vorbach | Oct 2014 | A1 |
20140297948 | Vorbach et al. | Oct 2014 | A1 |
Number | Date | Country |
---|---|---|
4416881 | Nov 1994 | DE |
10028397.1 | Dec 2001 | DE |
WO9525306 | Sep 1995 | WO |
WO9528671 | Oct 1995 | WO |
Entry |
---|
File History of U.S. Appl. No. 08/388,230. |
File History of U.S. Appl. No. 60/010,317. |
File History of U.S. Appl. No. 60/022,131. |
Chan, Pak K. , “A Field-Programmable Prototyping Board: XC4000 BORG User's Guide”, University of California, Santa Cruz (Apr. 1994). |
Schewel, John , “A Hardware/Software Co-Design System Using Configurable Computing Technology”. |
Hartenstein, Reiner W. et al. , “A New FPGA Architecture for Word-Oriented Datapaths”, Lecture Notes in Computer Science, vol. 849 (1994). |
Knittel, Guntar , “A PCI-Compatible FPGA-Coprocessor for 2D/3D Image Processing”, IEEE 1996. |
Schue, Rick , “A Simple DRAM Controller for 25/16 MHz i960® CA/CF Microprocessors”, Intel Corporation, Application Note AP•704 (Feb. 20, 1995). |
Alike, Peter and New, Bernie , “Additional XC3000 Data”, Xilinx, Inc., Xilinx Application Note, XAPP024.000 (1994). |
Altera Corporation , “Altera 1996 Data Book”, Altera Corporation (Jun. 1996). |
Altera Corporation , “Altera Applications Handbook”, Altera Corporation (Apr. 1992). |
Electronic Engineering , “Altera puts memory into its FLEX PLDs”, Electronic Engineering Times, Issue 840, Mar. 20, 1995. |
ARM , “AMBA: Advanced Microcontroller Bus Architecture Specification”, Advanced RISC Machines, Ltd., Document No. ARM IHI 0001C, Sep. 1995. |
Margolus, Norman , “An FPGA architecture for DRAM-based systolic computations”, The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Apr. 16, 1997). |
Krishnamohan, K. , “Applying Rambus Technology to Desktop Computer Main Memory Subsystems, Version 1.0”, Rambus Inc. (Mar. 1992). |
New, Bernie , “Boundary-Scan Emulator for XC3000”, Xilinx, Inc.,Xilinx Application Note, XAPP007.001 (1994). |
New, Bernie , “Bus-Structured Serial Input/Output Device”, Xilinx Application Note, XAPP010.001 (1994). |
Berkeley Design Technology Group , “Buyer's Guide to DSP Processors”, Berkeley Design Technology Group (1995). |
Algotronix, Ltd. , “CAL 4096 Datasheet”, Algotronix, Ltd. (1992). |
Algotronix, Ltd. , “CAL 64K Datasheet”, Algotronix, Ltd. (Apr. 6, 1989). |
Algotronix, Ltd. , “CHS2x4 User Manual”, Algotronix, Ltd. (1991). |
Altera Corporation , “ClockLock & ClockBoost Circuitry for High-Density PLDS”, The Altera Advantage News & Views, Newsletter for Altera Customers, Third Quarter, Aug. 1996. |
Altera Corporation , “Configuring FLEX 10K Devices”, Altera Corporation, Dec. 1995, ver. 1, Application Note 59. |
Schmidt, Ulrich, and Knut, Cesar, “Datawave: A Single-Chip Multiprocessor for Video Applications”, IEEE Micro (1991). |
Electronic Design, “Embedded Configurable Memory and Logic Boost FPGA Functionality”, Electronic Design, vol. 43, No. 14, Jul. 10, 1995. |
Xilinix, Inc., “Fully Compliant PCI Interface in an XC3164A-2 FPGA”, Xilinix, Inc. Application Note (Jan. 1995). |
Epstein, Dave, “IBM Extends DSP Performance with Mfast”, Microprocessor Reports, vol. 9, No. 16 (Dec. 4, 1995). |
IEEE, “IEEE Standard Test Access Port and Boundary-Scan Architecture”, IEEE STD 1149.1 Approved Feb. 15, 1990. |
Alfke, Peter and New, Bernie, “Implementing State Machines in LCA Devices”, Xilinx, Inc., Xilinx Application Note, XAPP027.001 (1994). |
Camilleri, Nick, and Lockhard, Chris, “Improving XC4000 Design Performance”, Xilinx Application Note, XAPP043.000 (1994). |
Intel Corporation, “Intel 82375EB/82375SB PCI-EISA Bridge (PCEB) Advance Information”, Intel Corporation (Mar. 1996). |
Wilkie, Bill, “Interfacing XC6200 to Microprocessors (MC68020 Example)”, Xilinx Application Note, XAPP 063, v. 1.1 (Oct. 9, 1996). |
Wilkie, Bill, “Interfacing XC6200 to Microprocessors (TMS320C50 Example)”, Xilinx Application Note, XAPP064 (Oct. 9, 1996). |
XCELL, “Introducing the XC6200 FPGA Architecture: The First FPGA Architecture Optimized for Coprocessing in Embedded System Applications”, XCELL, Iss. 18, 3d Quarter, 1995. |
Altera Corporation, “JTAG Boundary—Scan Testing in Altera Devices”, Altera Corporation, Nov. 1995, ver. 3, Application Note 39. |
Margolus, Norman, “Large-scale logic-array computation”, Boston University Center for Computational Science, SPIE vol. 2914 (May 1996). |
Alfke, Peter , “Megabit FIFO in Two Chips: One LCA Device and One DRAM”, Xilinx Application Note, XAPP030.000 (1994). |
del Corso, D. et al. , “Microcomputer Buses and Links”, Academic Press (1996). |
Bakkes, P.J. and du Plessis, J.J. , “Mixed Fixed and Reconfigurable Logic for Array Processing”, IEEE (1996). |
Altera Corporation , “PCI Bus Applications in Altera Devices”, Altera Corporation, Apr. 1995, ver. 1, Application Note 41. |
Altera Corporation , “PCI Bus Target Megafunction”, Altera Corporation, Solution Brief 6, ver. 1, Nov. 1996. |
Altera Corporation , “PCI Compliance of Altera Devices”, Altera Corporation, May 1995, ver. 2, Application Brief 140. |
SIG , “PCI Local Bus Specification”, PCI Special Interest Group, Production Version, Revision 2.1 (Jun. 1, 1995). |
Rambus Inc. , “Rambus Architectural Overview”, Rambus Inc. (1992). |
Rambus Inc. , “Rambus FPGA Proposal”, Rambus Inc. (Jan. 4, 1994). |
Rambus Inc. , “Rambus Product Catalog”, Rambus Inc. (1993). |
Xilinx, Inc. , “Series 6000 User Guide”, Xilinx, Inc. (1997). |
Cartier, Lois , “System Design with New XC4000EX I/O Features”, Xilinx Application Note, XAPP056 (Feb. 21, 1996). |
Xilinx, Inc. , “Technical Data—XC5200 Logic Cell Array Family, Preliminary, v.1.0”, Xilinx, Inc., (Apr. 1995). |
Xilinx, Inc. , “The Programmable Logic Data Book (1993)”, Xilinx, Inc. (1993). |
New, Bernie , “Ultra-Fast Synchronous Counters”, Xilinx Application Note, XAPP 014.001 (1994). |
Bolotski, Michael, DeHon, André, and Knight, Thomas , “Unifying FPGAs and SIMD Arrays”, 2nd International Workshop on Field-Programmable Gate Arrays, Feb. 13-15, 1994. |
Knapp, Steven K. , “Using Programmable Logic to Accelerate DSP Functions”, Xilinx, Inc. (1995). |
New, Bernie , “Using the Dedicated Carry Logic in XC4000”, Xilinx Application Note, Xapp 013.001 (1994). |
Iwanczuk, Roman , “Using the XC4000 RAM Capability”, Xilinx Application Note, XAPP 031.000 (1994). |
“IEEE Workshop on FPGAs for Custom Computing Machines”, IEEE Computer Society Technical Committee on Computer Architecture, Apr. 10-13, 1994. |
Nobuyuki Yamashita, et.al. , “A 3.84 GIPS Integrated Memory Array Processor with 64 Processing Elements and a 2-Mb SRAM”, IEEE Journal of Solid-State Circuits, vol. 29, Nov. 1994. |
Athanas, Peter , “Fun with the XC6200, Presentation at Cornell University”, Cornell University (Oct. 1996). |
Achour, C. , “A Multiprocessor Implementation of a Wavelet Transforms”, Proceedings on the 4th Canadian Workshop on Field-Programmable Devices, May 13-14, 1996. |
Electronic Engineering Times , “Altera ships 100,000-gate EPLD”, Electronic Engineering Times, Issue 917, Sep. 2 20, 1996. |
Altera Corporation , “Chipdata, Database Information for z1120a”, Altera Corporation, Sep. 11, 2012. |
Altera Corporation , “Embedded Programmable Logic Family Data Sheet”, Altera Corporation, Jul. 1995, ver. 1. |
Altera Corporation , “FLEX 10K 100, 000-Gate Embedded Array Programmable Logic Family”, Altera Advantage News & Views, Newsletter for Altera Customers, Second Quarter, May 1995. |
Altera Corporation , “Implementing Multipliers in FLEX 10K Devices”, Altera Corporation, Mar. 1996, ver. 1, Application Note 53. |
Intel 82375EB/82375SB PCI-EISA Bridge (PCEB) Advance Information, Xilinx Application Note, XAPP 063, v. 1.1 (Oct. 9, 1996). |
Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, IEEE Computer Society Technical Committee on Computer Architecture, Apr. 19, 1995. |
Proceedings of the Parallel Systems Fair, The International Parallel Processing Symposium, IEEE Computer Society Technical Committee for Parallel Processing, Apr. 27, 1994. |
Proceedings of the Workshop on Reconfigurable Architectures, 8th International Parallel Processing Symposium, IEEE Computer Society, Apr. 26, 1994. |
The Programmable Logic Conference & Exhibit Proceedings, Electronic Engineering Times, Apr. 25-27, 1995. |
Britton, Barry K. et al. , “Optimized Reconfigurable Cell Array Architecture for High-Performance Field Programmable Gate Arrays”, IEEE Custom Integrated Circuits Conference 1993. |
Landers, George , “Special Purpose Processor Speeds up DSP Functions, Reconfigurable Arithmetic Datapath Device”, Professional Program Proceedings, Electro Apr. 30-May 2, 1996. |
Proceedings of the Third Workshop on Reconfigurable Architectures, at Sheraton Waikiki Hotel, Honolulu, Hawai, Apr. 15, 1996. |
Proceedings of the Third Workshop on Reconfigurable Architectures, at Sheraton Waikiki Hotel, Honolulu, Hawaii, Apr. 15, 1996. |
Atmel Corporation, “Configurable Logic Design and Application Book 1993-1994—PLD, PFGA, Gate Array”, 1993. |
Atmel Corporation, “Configurable Logic Design and Application Book 1994-1995—PLD, PFGA, Gate Array”, 1994. |
N. Wirth, “An Extension-Board with an FPGA for Experimental Circuit Design”, ETH Zurich, Department Informatik, Jul. 1993. |
F. Furtek et al. “Labyrinth: A Homogeneous Computational Medium”, IEEE 1990 Custom Integrated Circuits Conference, 1990. |
Altera Corporation , “Altera 1998 Data Book”, Altera Corporation (Jan. 1998). |
Altera Corporation, “FLEX 10K—Embedded Programmable Logic Family”, Data Sheet, ver.3.13, Oct. 1998. |
Altera Corporation, “Implementing RAM Functions in FLEX 10K Devices”, Application Note 52, Ver. 1, Nov. 1995. |
Altera Corporation , “Altera 1993 Data Book”, Altera Corporation (Aug. 1993). |
Altera Corporation , “Altera 1995 Data Book”, Altera Corporation (Mar. 1995). |
Altera Corporation, “User-Configurable Microprocessor Peripheral EPB1400”, Rev. 1.0, 1987. |
Altera Corporation, “EPB2001—Card Interface Chip for PS/2 Micro Channel”, Data Sheet, Dec. 1989. |
Altera Corporation, “FLEX 8000 Handbook”, May 1994. |
Silberschatz and Galvin, Operating System Concepts, 1998, Addison Wesley, 5th edition, ISBN 0-201-59113-8, 31 pages. |
Altera FLEX 10K Embedded Programmable Logic Family Data Sheet; Oct. 1998, ver. 3.13; pp. 1-21. |
Hauser, John Reid, Augmenting a Microprocessor with Reconfigurable Hardware, University of California, Berkeley, Fall 2000. |
Katherine Compton, Scott Hauck, Reconfigurable computing: a survey of systems and software, ACM Computing Surveys (CSUR), v.34 n.2, p. 171-21 0, Jun. 2002. |
Goldberg D: “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, ACM Computing Surveys, ACM, New York, NY, US, US, vol. 23, No. 1, Mar. 1, 1991 (Mar. 1, 1991), pp. 5-48. |
Hauser et al. “Garp: A MIPS Processor with a Reconfigurable Coprocessor”, Apr. 1997, pp. 12-21. |
Libo Huang et al: “A New Architecture for Multiple-Precision Floating-Point Multiply-Add Fused Unit Design” Computer Arithmetic, 2007. ARITH '07. 18th IEEE Symposium on, IEEE, PI, Jun. 1, 2007 (Jun. 1, 2007), Seiten 69-76. |
Manhwee Jo et al: “Implementation of floating-point operations for 3D graphics on a coarse-grained reconfigurable architecture” SOC Conference, 2007 IEEE International, IEEE, Piscataway, NJ, USA, Sep. 26, 2007 (Sep. 26, 2007), Seiten 127-130. |
Mirsky E. et al., “MATRIX: A Reconfigurable Computing Architecture with Configurable Instruction Distribution and Deployable Resources”, 1996, IEEE, pp. 157-166. |
Shirazi et al., “Quantitative analysis of floating point arithmetic on FPGA based custom computing machines,” IEEE Symposium on FPGAs for Custom Computing Machines, I EEE Computer Society Press, Apr. 19-21, 1995, pp. 155-162. |
Vermeulen et al., Silicon Debug of a Co-Processor Array for Video Applications, 2000, IEEExplore, 0-7695-0786-7/00, pp. 47-52, [retrieved on Feb. 1, 2015], retrieved from URL http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=889558&tag=1>. |
Xilinx, Inc., “The Programmable Logic Data Book (1994)”, Xilinx, Inc. (1994). |
Xilinx, Inc., “The Programmable Logic Data Book (1996)”, Xilinx, Inc. (Jan. 1996). |
Churcher, Stephen et al., “The XC6200 FastMap Processor Interface”, FPL (Aug. 1995). |
Texas Instruments Incorporated, “TMS320C80 (MVP) Parallel Processor User's Guide”, Texas Instruments Incorporated (1995). |
Texas Instruments Incorporated, “TMS320C8x System-Level Synopsis”, Texas Instruments Incorporated (Sep. 1995). |
Xilinx, Inc., “XC6200 Field Programmable Gate Arrays, Advance Product Specification, v. 1.0, Jun. 1, 1996”, Xilinx, Inc. (Jun. 1, 1996). |
Xilinx, Inc., “Xilinx XC6200 Field Programmable Gate Arrays, Product Specification, v.1.10, Apr. 24, 1997”, Xilinx, Inc. (Apr. 24, 1997). |
Altera Corporation, “Programmable Peripheral Interface Adapter a8255, Sep. 1996, ver. 1”, Altera Corporation, Sep. 1996, ver. 1. |
Altera Corporation, “Universal Asynchronous Receiver/Transmitter a16450, Sep. 1996, ver. 1”, Altera Corporation, Sep. 1996, ver. 1. |
Altera Corporation, “Asynchronous Communications Interface Adapter a6850, Sep. 1996, ver. 1”, Altera Corporation, Sep. 1996, ver. 1. |
Schmit, Herman et al., “Behavioral Synthesis for FPGA-based Computing”, IEEE (1994). |
Allaire, Bill and Knapp, Steve, “A Plug and Play Interface Using Xilinx FPGAs”, Xilinx, Inc. (May 1995). |
Goslin, Greg and Newgard, Bruce, “16-Tap, 8-Bit FIR Filter Applications Guide”, Xilinx Application Note v. 1.01 (Nov. 21, 1994). |
Veendrick, H., “A 1.5 GIPS Video Signal Processor (VSP)”, IEEE 1994 Custom Integrated Circuits Conference (1994). |
Yeung, Alfred K. and Rabaey, Jan M., “A 2.4GOPS Data-Driven Reconfigurable Multiprocessor IC for DSP”, IEEE International Solid-State Circuits Conference (1995). |
Duncan, Ann, “A 32×16 Reconfigurable Correlator for the XC6200”, Xilinx Application Note, XAPP 084, v. 1.0 (Jul. 25, 1997). |
Yeung, Kwok Wah, “A Data-Driven Multiprocessor Architecture for High Throughput Digital Signal Processing”, U.C. Berkeley (Jul. 10, 1995). |
Koren, Israel et al., “A Data-Driven VLSI Array for Arbitrary Algorithms”, IEEE (1988). |
Xilinx, Inc., “A Fast Constant Coefficient Multiplier”, Xilinx, Inc., Xilinx Application Note, XAPP 082, v. 1.0 (Aug. 24, 1997). |
Sutton, Roy A. et al., “A Multiprocessor DSP System Using PADDI-2”, U.C. Berkeley (1998). |
Chen, Dev C. and Rabaey, Jan M., “A Reconfigurable Multiprocessor IC for Rapid Prototyping of Algorithmic-Specific High-speed DSP Data Paths”, IEEE Journal of Solid State Circuits (Dec. 1992). |
Minnick, Robert, “A Survey of Microcellular Research”, J. of the Association for Computing Machinery, vol. 14, No. 2 (Apr. 1967). |
Trimberger, Steve et al., “A Time-Multiplexed FPGA”, IEEE (1997). |
New, Bernie, “Accelerating Loadable Counters in XC4000”, Xilinx Application Note, XAPP 023.001 (1994). |
Athanas, Peter, “An Adaptive Machine Architecture and Compiler for Dynamic Processor Reconfiguration”, Brown University (May 1992). |
Atmel Corporation, “Application Note AT6000 Series Configuration”, Published in May 1993. |
Agarwal, Anant et al., “APRIL: A Processor Architecture for Multiprocessing”, IEEE (1990). |
Allaire, Bill and Fischer, Bud, “Block Adaptive Filter”, Xilinx Application Note, XAPP 055, v. 1.0 (Aug. 15, 1996). |
Bittner, Jr., Ray A. et al., “Colt: An Experiment in Wormhole Run-Time Reconfiguration”, Proc. of SPIE, vol. 2914 (Oct. 21, 1996). |
New, Bernie, “Complex Digital Waveform Generator”, Xilinx Application Note, XAPP 008.002 (1994). |
Alfke, Peter, “Dynamic Reconfiguration”, Xilinx Application Note, XAPP 093, v. 1.1 (Nov. 10, 1997). |
Canadian Microelectronics Corp, “Field-Programmable Devices”, 1994 Canadian Workshop on Field-Programmable Devices, Jun. 13-16, 1994, Kingston, Ontario. |
S. Brown et al., Published by Kluwer Academic Publishers, “Field Programmable Gate Arrays”, Atmel Corporation, 1992. |
Atmel Corporation, “Field Programmable Gate Arrays, AT6000 Series”, Atmel Corporation, 1993. |
International Society for Optical Engineering, “Field Programmable Gate Arrays (FPGAs) for Fast Board Development and Reconfigurable Computing”, International Society for Optical Engineering, vol. 2607, Oct. 25-26, 1995. |
Trimberger, Stephen M., “Field-Programmable Gate Array Technology”, Kluwer Academic Publishers (1994). |
Hartenstein, Reiner and Servit, Michal (Eds.), “Field-Programmable Logic—Architectures, Synthesis and Applications”, 4th Intl Workshop on Field-Programmable Logic and Applications, FPL '94, Prague, Czech Republic, Sep. 7-9, 1994. |
IEEE Computer Society, “FPGAs for Custom Computing Machines”, FCCM '93, IEEE Computer Society, Apr. 5-7, 1993. |
Cowie, Beth, “High Performance, Low Area, Interpolator Design for the XC6200”, Xilinx Application Note, XAPP 081, v. 1.0 (May 7, 1997). |
IEEE Computer Society Technical Committee on Computer Architecture, “IEEE Symposium on FPGAs for Custom Computing Machines”, IEEE Computer Society Technical Committee on Computer Architecture, Apr. 19-21, 1995. |
B. Schoner, C. Jones and J. Villasenor, “Issues in wireless video coding using run-time-reconfigurable FPGAs”, Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (Apr. 19, 1995). |
Moore, Will and Luk, Wayne, “More FPGAs”, Abingdon EE&CS Books (1994). |
Fawcett, Bradly K., “New SRAM-Based FPGA Architectures Address New Applications”, IEEE (Nov. 1995). |
Department of Electrical and Computer Engineering, The University of Toronto, “Proceedings of the 4th Canadian Workshop on Field-Programmable Devices”, Proceedings of the 4th Canadian Workshop on Field-Programmable Devices, Department of Electrical and Computer Engineering, The University of Toronto, May 13-14, 1996. |
Chen, Devereaux C., “Programmable Arithmetic Devices for High Speed Digital Signal Processing”, U.C. Berkeley (1992). |
Vasell, Jasper, et al., “The Function Processor: A Data-Driven Processor Array for Irregular Computations”, Future Generations Computer Systems, vol. 8, Issue 4 (Sep. 1992). |
T. Korpiharju, J. Viitanen, H. Kiminkinen, J. Takala, K. Kaski, “TUTCA configurable logic cell array architecture”, IEEE (1991). |
Number | Date | Country | |
---|---|---|---|
20140208143 A1 | Jul 2014 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13653639 | Oct 2012 | US |
Child | 14219945 | | US |
Parent | 10469909 | | US |
Child | 12257075 | | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 12570984 | Sep 2009 | US |
Child | 13653639 | | US |
Parent | 12257075 | Oct 2008 | US |
Child | 12570984 | | US |