The present invention is related in general to memory controllers and more specifically to the design of a memory controller for use in an adaptive computing environment.
The advances made in the design and development of integrated circuits (“ICs”) have generally produced information-processing devices falling into one of several distinct types or categories having different properties and functions, such as microprocessors and digital signal processors (“DSPs”), application specific integrated circuits (“ASICs”), and field programmable gate arrays (“FPGAs”). Each of these different types or categories of information-processing devices has distinct advantages and disadvantages.
Microprocessors and DSPs, for example, typically provide a flexible, software-programmable solution for a wide variety of tasks. The flexibility of these devices requires a large amount of instruction decoding and processing, resulting in a comparatively small amount of processing resources devoted to actual algorithmic operations. Consequently, microprocessors and DSPs require significant processing resources, in the form of clock speed or silicon area, and consume significantly more power compared with other types of devices.
ASICs, while having comparative advantages in power consumption and size, use a fixed, “hard-wired” implementation of transistors to implement one or a small group of highly specific tasks. ASICs typically perform these tasks quite effectively; however, ASICs are not readily changeable, essentially requiring new masks and fabrication to realize any modifications to the intended tasks.
FPGAs allow a degree of post-fabrication modification, enabling some design and programming flexibility. FPGAs are comprised of small, repeating arrays of identical logic devices surrounded by several levels of programmable interconnects. Functions are implemented by configuring the interconnects to connect the logic devices in particular sequences and arrangements. Although FPGAs can be reconfigured after fabrication, the reconfiguring process is comparatively slow and is unsuitable for most real-time, immediate applications. Additionally, FPGAs are very expensive and very inefficient for implementation of particular functions. An algorithmic operation implemented on an FPGA may require orders of magnitude more silicon area, processing time, and power than its ASIC counterpart, particularly when the algorithm is a poor fit to the FPGA's array of homogeneous logic devices.
An adaptive computing engine (ACE) or adaptable computing machine (ACM) allows a collection of hardware resources to be rapidly configured for different tasks. Resources can include, e.g., processors, or nodes, for performing arithmetic, logical and other functions. The nodes are provided with an interconnection system that allows communication among nodes and communication with resources such as memory, input/output ports, etc. One type of valuable processing is memory access services. In order to provide memory access services to access external memory, an external memory controller is typically needed.
Thus, there is a desire to provide a memory controller that provides memory access services in an adaptive computing engine.
Embodiments of the present invention generally relate to using a memory controller to provide memory access services in an adaptive computing engine.
In one embodiment, a memory controller in an adaptive computing engine (ACE) is provided. The controller includes a network interface configured to receive a memory request from a programmable network; and a memory interface configured to access a memory to fulfill the memory request from the programmable network, wherein the memory interface receives and provides data for the memory request to the network interface, the network interface configured to send data to and receive data from the programmable network.
In another embodiment, a memory controller includes a network interface configured to receive a memory request for a memory access service from a network; and one or more engines configured to receive the memory request and to provide the memory access service associated with the memory request.
In yet another embodiment, a memory controller includes one or more ports configured to receive memory requests, wherein each port includes one or more parameters; an engine configured to receive a memory request from a port in the one or more ports; and a data address generator configured to generate a memory location for a memory based on the one or more parameters associated with the port, wherein the engine is configured to perform a memory operation for the memory request using the generated memory location.
In another embodiment, a memory controller includes one or more ports configured to receive memory requests from requesting nodes, wherein each port includes one or more parameters, the one or more parameters configurable by information in the memory requests; a point-to-point engine configured to receive a memory request from a port in the one or more ports; a data address generator configured to generate a memory location for a memory based on the one or more parameters associated with the port, wherein the point-to-point engine performs a memory operation using the generated memory location while adhering to a point-to-point protocol with the requesting node.
In another embodiment, a system for processing memory service requests in an adaptable computing environment is provided. The system comprises: a memory; one or more nodes configured to generate a memory service request; a memory controller configured to receive the memory service request, the memory controller configured to service the memory service request, wherein the memory controller reads or writes data from or to the memory based on the memory service request.
A further understanding of the nature and the advantages of the inventions disclosed herein may be realized by reference to the remaining portions of the specification and the attached drawings.
A preferred embodiment of the invention uses an adaptive computing engine (ACE) architecture including an external memory controller (XMC) node. Details of an exemplary ACE architecture are disclosed in U.S. patent application Ser. No. 09/815,122, entitled “Adaptive Integrated Circuitry with Heterogeneous and Reconfigurable Matrices of Diverse and Adaptive Computational Units having Fixed, Application Specific Computational Elements,” referenced above.
In general, the ACE architecture includes a plurality of heterogeneous computational elements coupled together via a programmable interconnection network.
A plurality of heterogeneous computational elements (or nodes), including computing elements 120, 122, 124, and 126, comprise fixed and differing architectures corresponding to different algorithmic functions. Each node is specifically adapted to implement one of many different categories or types of functions, such as internal memory, logic and bit-level functions, arithmetic functions, control functions, and input and output functions. The quantity of nodes of differing types in an ACE device can vary according to the application requirements.
Because each node has a fixed architecture specifically adapted to its intended function, nodes approach the algorithmic efficiency of ASIC devices. For example, a binary logical node may be especially suited for bit-manipulation operations such as logical AND, OR, NOR, and XOR operations, bit shifting, etc. An arithmetic node may be especially well suited for math operations such as addition, subtraction, multiplication, division, etc. Other types of nodes are possible that can be designed for optimal processing of specific types.
Programmable interconnection network 110 enables communication among a plurality of nodes such as 120, 122, 124 and 126, and interfaces 102, 104, 106, and 108. The programmable interconnection network can be used to reconfigure the ACE device for a variety of different tasks. For example, changing the configuration of the interconnections between nodes can allow the same set of heterogeneous nodes to implement different functions, such as linear or non-linear algorithmic operations, finite state machine operations, memory operations, bit-level manipulations, fast-Fourier or discrete-cosine transformations, and many other high level processing functions for advanced computing, signal processing, and communications applications.
In one embodiment, programmable interconnection network 110 comprises a network root 130 and a plurality of crosspoint switches, including switches 132 and 134. In one embodiment, programmable interconnection network 110 is logically and/or physically arranged as a hierarchical tree to maximize distribution efficiency. In this embodiment, a number of nodes can be clustered together around a single crosspoint switch. The crosspoint switch is further connected with additional crosspoint switches, which facilitate communication between nodes in different clusters. For example, cluster 112, which comprises nodes 120, 122, 124, and 126, is connected with crosspoint switch 132 to enable communication with the nodes of clusters 114, 116, and 118. Crosspoint switch 132 is further connected with additional crosspoint switches, for example crosspoint switch 134 via network root 130, to enable communication between any of the plurality of nodes in ACE device 100.
The programmable interconnection network (PIN) 110, in addition to facilitating communications between nodes within ACE device 100, also enables communication with nodes within other ACE devices via network input and output interfaces 104 and 108, respectively, and with other components and resources through other interfaces such as 102 and 106.
In accordance with embodiments of the present invention, a memory controller is used to provide memory access services in an ACE architecture.
Nodes 301 can be any nodes (e.g., computational elements or resources) in a computing device. Nodes 301 initiate memory service requests to memory controller 302. For example, nodes 301 can initiate read and write commands. If a read command is initiated, the requesting node is considered a “consumer” in that it consumes data read from memory 304, and if a write command is initiated, the requesting node is considered a “producer” in that it produces data to be written to memory 304. The read and write commands may be in the form of different memory access services that are described below.
PIN 110 receives memory service requests from nodes 301 in the ACE device. Additionally, PIN 110 receives and/or sends data from/to memory controller 302 and receives and/or sends the data from/to the requesting nodes in the ACE device.
Memory controller 302 receives memory access service requests from PIN 110 and processes the requests accordingly. In one embodiment, the services provided by memory controller 302 include a peek and poke service, a memory random access (MRA) service, a direct memory access (DMA) service, a point-to-point (PTP) service, a real-time input (RTI) service and a message service. The peek and poke service allows a requesting node to peek (retrieve) data and poke (write) data found in memory controller 302. A memory random access (MRA) service allows a requesting node to do a read and write to memory 304. A direct memory access (DMA) service allows a requesting node to request large blocks of data from memory 304. A point-to-point (PTP) service allows a requesting node to read and write data, and update port parameters, in a process that conforms to a point-to-point protocol. In one embodiment, the PTP service is used to read and write real-time streaming data. The real-time input (RTI) service performs the same service as the PTP service but uses a reduced acknowledgement protocol. Additionally, memory controller 302 provides messaging to nodes in the ACE device. For example, memory controller 302 can provide confirmation acknowledgement messages to requesting nodes that may be used for flow control.
In one embodiment, memory 304 is an external memory for an ACE device. Memory 304 receives memory service requests from memory controller 302 and provides data to memory controller 302 when a read operation is requested. Additionally, memory controller 302 may provide data to memory 304 that is to be written to memory 304. Memory 304 may be any memory, such as a synchronous dynamic random access memory (SDRAM), a flash memory, a static random access memory (SRAM) and the like.
The above-mentioned services that may be provided by memory controller 302 will now be described. Although the following memory services are described, it will be understood that a person skilled in the art will appreciate other memory services that memory controller 302 may provide.
Flow control is provided for a poke request in that a requesting poke waits for a poke acknowledgement before initiating a new poke to the same memory. In the case where multiple services are provided in memory 304, multiple requests to different memories may be allowed.
PIN interface 400 is configured to receive memory service requests from PIN 110. Additionally, PIN interface 400 is configured to send data or any other messages to PIN 110. In one embodiment, PIN interface 400 includes a distributor, input arbiter, and an aggregator. The distributor and arbiter facilitate distributing data to one or more engines 402. The aggregator aggregates words that will be sent to nodes. When a request is received at PIN interface 400, PIN interface 400 determines which engine in engines 402 to send the request to.
In one embodiment, PIN interface 400 also provides a priority system for memory service requests. For example, one memory priority system may give a peek/poke memory service request the highest priority. Random read requests that are received with a fast track or higher priority indication are then given the next highest priority. All other requests are given a lowest priority. For example, random memory access requests are placed on a 132 entry first come first serve queue, DMA and PTP requests are placed on a single 64 entry first come first serve queue and these two queues are serviced on a round robin basis.
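The priority scheme described above can be illustrated with a minimal sketch. The class name `RequestArbiter`, the request-kind strings, and the method names are hypothetical and are chosen only for illustration; queue depth limits are omitted, and the sketch is not intended to represent the actual arbitration logic of PIN interface 400.

```python
from collections import deque


class RequestArbiter:
    """Hypothetical model: peek/poke first, fast-track reads second,
    then two first-come-first-serve queues served round robin."""

    def __init__(self):
        self.peek_poke = deque()
        self.fast_track = deque()
        self.random_q = deque()   # random memory access (MRA) requests
        self.stream_q = deque()   # DMA and PTP requests share one queue
        self._next_low = 0        # which low-priority queue is tried first

    def submit(self, kind, request, fast_track=False):
        if kind in ("peek", "poke"):
            self.peek_poke.append(request)
        elif kind == "mra_read" and fast_track:
            self.fast_track.append(request)
        elif kind in ("mra_read", "mra_write"):
            self.random_q.append(request)
        elif kind in ("dma", "ptp"):
            self.stream_q.append(request)
        else:
            raise ValueError(f"unknown request kind: {kind}")

    def next_request(self):
        """Return the next request to service, or None if all queues are empty."""
        if self.peek_poke:
            return self.peek_poke.popleft()
        if self.fast_track:
            return self.fast_track.popleft()
        queues = (self.random_q, self.stream_q)
        for offset in range(2):
            idx = (self._next_low + offset) % 2
            if queues[idx]:
                self._next_low = (idx + 1) % 2  # other queue goes first next time
                return queues[idx].popleft()
        return None
```

In this sketch, a poke submitted after several DMA and MRA requests is still returned first, and the two low-priority queues alternate whenever both hold requests.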
As shown, one or more engines 402 includes a peek/poke engine 410, a fast track engine 412, a random read/write engine 414, and a PTP/DMA/RTI engine 416 according to one embodiment of the invention. Although these engines 402 are described, a person skilled in the art will appreciate that other engines may be provided to perform functions related to the memory access services. Engines 402 process a memory service request and provide the appropriate request to memory interface 404 to fulfill the memory service request. For example, engines 402 determine a memory address that data should be read from in memory 304 or the data and a memory address in which data should be written to in memory 304. The action is then performed according to a protocol associated with the memory service request.
Memory interface 404 receives memory service requests from engines 402 and provides them to SDRAM memory 406 and/or flash memory 408. Although SDRAM memory 406 and flash memory 408 are shown, it will be understood that a person skilled in the art will appreciate other memories that may be used.
The types of services that are provided by engines 402 will now be described.
When a peek memory service request is received at PIN interface 400, it determines that the request should be sent to peek/poke engine 410. The peek request is received in one or more data words and PIN interface 400 is configured to determine from data in the data words that a peek should be performed. The peek request is then forwarded to peek/poke engine 410, which determines peek address(es) that data should be read from. In one embodiment, peek requests are used to read data from memory or registers found in controller 302. For example, registers storing parameters 422 in ports 418 may be peeked. The data request at the determined address(es) is then sent to appropriate registers. The data is then returned to peek/poke engine 410 and sent to the requesting node through PIN interface 400 and PIN 110.
In order to provide flow control, the requesting node waits for receipt of prior peek data before initiating a new peek request to the same memory.
When a poke request is received at PIN interface 400, PIN interface 400 determines that the request should be sent to peek/poke engine 410. In one embodiment, a poke request is sent in one or more data words and PIN interface 400 determines from the one or more data words that the request should be sent to peek/poke engine 410. Peek/poke engine 410 receives a poke address word from the requester and a poke data word to write to the address previously supplied by the poke address word. For example, registers including parameters 422 may have data written to them. Peek/poke engine 410 also determines from the one or more data words which register to write the data to.
After the data has been written, a poke acknowledgement may be sent by peek/poke engine 410 to the requesting node through PIN interface 400 and PIN 110. Flow control can be realized by requiring a requesting node to wait for full acknowledgement before initiating a new poke to the same memory.
Fast track engine 412 is provided to perform memory access services that have a higher priority. Thus, fast track engine 412 allows requesting nodes to send requests and data in an expedited manner.
When a memory random access read or write is received at PIN interface 400, PIN interface 400 then provides the memory service request to random read/write engine 414. In one embodiment, a double word (32-bits) on a double word boundary may be read at a certain specified address or a burst read, which reads 16 double words on double word boundaries, may be performed.
In one embodiment, MRA read requests are placed in a queue and random read/write engine 414 services the requests in a first in/first out manner. When a request to memory 304 is ready, random read/write engine 414 sends the determined address, with an indication of the appropriate memory that data should be read from, to memory interface 404. The request is forwarded to memory 304 and data is read and returned to random read/write engine 414. The data can then be returned to the requesting node through PIN interface 400 and PIN 110.
In order to maintain flow control, in one embodiment, the requesting node waits for receipt of prior MRA read data before initiating a new MRA read or write to the same memory. Thus, the requesting node may make a first read request to SDRAM memory 406 and a second request to flash memory 408 simultaneously but cannot have multiple requests outstanding to either SDRAM memory 406 or flash memory 408.
When PIN interface 400 receives an MRA write request, it determines from one or more data words in the request that an MRA write should be performed. For example, a bit or any other indication may be set in the one or more data words to indicate the request is an MRA request. The request is then forwarded to random read/write engine 414, which determines a memory location from the one or more data words where the data should be written. The address is then stored in a table and when data for the write is received (either with the one or more data words containing the request or in one or more data words received later), the data is then stored in a temporary buffer. The MRA request is then placed in a queue. The queue is serviced in a first in/first out manner by random read/write engine 414.
When the MRA write request is serviced, the data is retrieved from the temporary buffer and written to the address included in the appropriate entry of the random address queue. In this case, the data, the address, and an indication of which memory to write the data to are sent to memory interface 404, which writes the data to either SDRAM memory 406 or flash memory 408 at the address specified. Random read/write engine 414 then sends an MRA write acknowledgement to the requesting node. Flow control is maintained because a requesting node waits for an MRA write acknowledgement before issuing a new MRA read or write to the same memory.
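The MRA write sequence described above (address stored in a table, data held in a temporary buffer, request queued, then serviced in first in/first out order and acknowledged) can be sketched as follows. The class `RandomWriteEngine`, its method names, the request identifiers, and the use of dictionaries to stand in for memories are all hypothetical and serve only to illustrate the ordering of the steps.

```python
from collections import deque


class RandomWriteEngine:
    """Hypothetical model of the MRA write path; not the actual engine 414."""

    def __init__(self):
        self.memories = {"sdram": {}, "flash": {}}  # stand-ins for memories 406/408
        self.address_table = {}   # request id -> (node, memory, address)
        self.temp_data = {}       # request id -> data awaiting the write
        self.queue = deque()      # first in/first out queue of request ids
        self._next_id = 0

    def accept_request(self, node, memory, address):
        """Record the target of a write; data may arrive with it or later."""
        req_id = self._next_id
        self._next_id += 1
        self.address_table[req_id] = (node, memory, address)
        return req_id

    def accept_data(self, req_id, data):
        """Buffer the data and place the request on the service queue."""
        self.temp_data[req_id] = data
        self.queue.append(req_id)

    def service_one(self):
        """Perform the oldest queued write and return its acknowledgement."""
        if not self.queue:
            return None
        req_id = self.queue.popleft()
        node, memory, address = self.address_table.pop(req_id)
        self.memories[memory][address] = self.temp_data.pop(req_id)
        return ("mra_write_ack", node)
```

A requesting node modeled against this sketch would wait for the returned acknowledgement before issuing another MRA read or write to the same memory.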
A plurality of ports 418 are provided for the direct memory access (DMA), point-to-point (PTP), and real-time input (RTI) memory services. In one embodiment, each port includes DAG parameters and other parameters 422 and a temporary buffer 424. In a preferred embodiment the DAG is used to generate sequences of addresses for both reading and writing memory. For example, a node that desires to access a pattern of memory locations obtains the addresses from the DAG. The DAG can be configured in various ways such as, e.g., by a control node poking port configuration parameters. Another way to configure the DAG is dynamically via PTP control words. Details of the DAG are provided in following sections.
One or more DAG parameters 422 associated with a port 418 are used by DAG 420 to determine the appropriate data to retrieve from memory 304, or the appropriate location in memory to update. Other parameters can be included, such as temporary buffer parameters, control and status register bits, producer information, consumer information, counts, and the like.
In one embodiment, each of ports 418 includes a temporary buffer 424. Temporary buffer 424 is used to store one or more PTP/DMA/RTI words that are received from a requesting node. When data is stored in temporary buffer 424, an indication of the kind of request associated with the stored data is stored in queue 426.
A PTP_DMA_Queue 426 is maintained by the PTP/DMA/RTI engine 416 for servicing of ports. Various events as described below cause a port to be placed on this first-in-first-out queue.
The services provided by PTP/DMA/RTI engine 416 will now be described.
Direct memory access services include a DMA read and a DMA write service. In a DMA read service, any of the ports 418 can serve as a source of a DMA channel set up by a requesting node 301. When a DMA read request for a port i in ports 418 is serviced, DAG 420 is configured with the DAG parameters for port i. Data is then read from memory 304, such as SDRAM memory 406 or flash memory 408, using the just configured DAG 420 by PTP/DMA/RTI engine 416.
The DMA read may be for a large chunk of data and multiple reads may be needed to read the entire requested chunk of data. Thus, memory controller 302 may send multiple chunks of data to a requesting node 301 in response to a DMA read. In one embodiment, counts are used to determine how much data to read. For example, chunks of data may be read in 32-bit words but the read request may be for seven bytes. The count would be set to seven and when the first word, which includes four bytes, is read, the count is decremented to three. When the next word is read, the count is decremented to zero and only three bytes are read because the count was three. In some cases, multiple DMA reads may be serviced for a node.
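The count-based example above can be expressed as a short sketch. The function name `dma_read_chunks` and its parameters are hypothetical; the sketch assumes only that the count is decremented by the number of valid bytes taken from each word.

```python
def dma_read_chunks(total_bytes, word_bytes=4):
    """Return the number of valid bytes taken from each word read.

    The count starts at the requested byte total and is decremented
    per word until it reaches zero.
    """
    count = total_bytes
    valid_per_word = []
    while count > 0:
        taken = min(word_bytes, count)  # last word may be only partly valid
        valid_per_word.append(taken)
        count -= taken
    return valid_per_word


# Seven bytes requested with 32-bit words: four bytes, then three bytes.
print(dma_read_chunks(7))  # [4, 3]
```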
In order to maintain flow control, memory controller 302 waits for a DMA read chunk acknowledgment from the requesting node before transmitting the next chunk of data. Also, PTP/DMA/RTI engine 416 waits for a DMA done message from the requesting node before a new DMA read from the same memory 304, such as SDRAM memory 406 or flash memory 408, is initiated.
PTP/DMA/RTI engine 416 can also perform a DMA write. Any of the ports in ports 418 may serve as the destination of a DMA channel set up by a requesting node. Temporary buffer 424 is provided in each of ports 418 in order to store incoming DMA data that is eventually written into memory 304. Although buffer 424 is described, it will be understood that buffer 424 may not be used and the data may be streamed to PTP/DMA/RTI engine 416. Because a DMA write might be a write for large amounts of data, the data may arrive in multiple data words over a period of time. When a DMA write request is received at a port i in ports 418, if port i's temporary buffers 424 are already full, an error message is sent to the requesting node. If not, the data is written sequentially into port i's temporary buffer 424 and a corresponding DMA write request is placed in queue 426. As more data is received on port i, the data is written sequentially into the port's temporary buffer 424 if it is not already full. When the last data word for the DMA write request is received on port i, a DMA write request is placed in queue 426. Although the above sequence is described, it will be understood that a person skilled in the art will appreciate other ways of handling the received data.
When the DMA write request is ready to be serviced by PTP/DMA/RTI engine 416, DAG 420 of PTP/DMA/RTI engine 416 is configured with DAG parameters 422 for port i. Each successive DMA write request is read from queue 426 and the corresponding data in port i's temporary buffer 424 is then written to memory 304, such as SDRAM memory 406 or flash memory 408, using the just configured DAG 420. DAG 420 may calculate addresses based on one or more parameters 422 associated with port i and an address associated with the applicable memory DMA request. The addresses may be calculated for each successive DMA write request and DAG 420 may be configured with parameters 422 for each write request.
In order to maintain flow control, the transmitting node waits for a chunk acknowledgment from memory controller 302 that indicates the chunk of data has been stored in temporary buffer 424 before transmitting the next chunk of data to be stored in port i's temporary buffer 424. Additionally, the requesting node waits for a DMA done message from memory controller 302 before initiating a new DMA write to the same memory 304.
In one embodiment, counts are used to determine how much data to write. For example, chunks of data may be received in 32-bit words. The write request may be for seven bytes. The count would be set to seven and when the first word, which includes four bytes, is received and written, the count is decremented to three. When the next word is received, the count is decremented to zero and only three bytes are written because the count was three.
Point-to-point memory services may also be performed by PTP/DMA/RTI engine 416. Nodes 301 may read and write memory 304 and update selected port parameters 422 via any of ports 418 using a point-to-point protocol. Memory controller 302 adheres to all point-to-point conventions, performs forward and backward ACKing, and also maintains counts for consumers and producers. Additionally, flow control is maintained because of the point-to-point conventions. For example, in a write request, neither temporary buffer 424 for ports 418 nor a buffer in memory 304 will overflow so long as the requesting node adheres to PTP conventions. Additionally, in a read request, memory controller 302 will not overflow the consuming node's input buffer as long as the requesting node adheres to PTP conventions.
PTP/DMA/RTI engine 416 may perform point-to-point memory services using a number of modes. For example, an auto-source mode provides an infinite source of data. A read occurs automatically when there is available space in a consuming node's input buffer and read requests are not used. An infinite-sink mode may be provided to provide an infinite sink for data. In this case, a write occurs when there is data in temporary buffer 424 and new data overwrites old data when the main buffer is full. In one embodiment, memory 304 includes a main buffer where data is written to. Thus, data is read from temporary buffer 424 and written to the main buffer. Although a main buffer is described, it will be understood that data may be written to other structures in memory 304. A finite-sink mode provides a finite sink for data. In this case, a write occurs when there is data in temporary buffer 424 and available space in the main buffer and writing stops when the main buffer is full. A buffer mode implements a first in/first out (FIFO) queue. In this case, writes fill the main buffer while reads drain the main buffer. A write occurs when there is data in the temporary buffer and available space in the main buffer. A read occurs when there is sufficient data in the main buffer and available space in the consuming node's input buffer. A basic mode provides unrestricted writing to a data structure. In this case, a write occurs when there is data in the temporary buffer, and old data in memory is overwritten. Also, the basic mode provides unrestricted reading of a data structure. A read occurs after an explicit read request is received and there is available space in the consuming node's input buffer.
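The conditions under which a write or a read occurs in each of the modes described above can be summarized in a minimal sketch. The mode strings and function names are hypothetical. Where the description is silent (for example, writes in the auto-source mode or reads in the sink modes), the sketch assumes that no such operation occurs; that choice is an assumption made only for illustration.

```python
def write_allowed(mode, temp_has_data, main_has_space):
    """Whether data moves from the temporary buffer into memory."""
    if mode in ("infinite_sink", "basic"):
        # Old data is overwritten, so main-buffer space is not checked.
        return bool(temp_has_data)
    if mode in ("finite_sink", "buffer"):
        return bool(temp_has_data and main_has_space)
    if mode == "auto_source":
        return False  # assumption: only reads are described for this mode
    raise ValueError(f"unknown mode: {mode}")


def read_allowed(mode, consumer_has_space, main_has_data=True, read_requested=False):
    """Whether data is read from memory and sent to the consuming node."""
    if mode == "auto_source":
        return bool(consumer_has_space)  # no read request is needed
    if mode == "buffer":
        return bool(main_has_data and consumer_has_space)
    if mode == "basic":
        return bool(read_requested and consumer_has_space)
    if mode in ("infinite_sink", "finite_sink"):
        return False  # assumption: only writes are described for these modes
    raise ValueError(f"unknown mode: {mode}")
```

For instance, under these assumptions a finite-sink write with a full main buffer is refused, while an infinite-sink write under the same conditions proceeds and overwrites old data.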
Data packets are received from a data source such as a distributor (e.g., from PIN Interface 400 of
Port parameters can be updated by information in “poke packets” or by control-word information in incoming PTP/DMA packets. The parameter update information is provided to parameter control system 602. Port parameters are used to define characteristics of a port for specific or desired functionality. For example, port parameters control characteristics of temporary buffers, removing control and data words from the temporary buffer for processing, unpacking data (double-) words into records in preparation for writing to main memory, writing and reading records to main memory, packing records read from memory into double-words and composing appropriate MIN words for transmission to the consumer node, sending various control words—forward and backward acknowledgements, DMA chunk acknowledgements and DMA Done messages—to the producer and consumer nodes; and other functions.
Unpacked data produced by unpacker 608 can include one or more records. Each record can be 8, 16 or 32 bits. A 4-bit byte select is sent with each 32-bit unpacked datum to indicate which of the bytes contain valid data and are to be written to memory.
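One possible way to derive a 4-bit byte select for a record is sketched below. The function name `byte_select`, the convention that bit k of the select corresponds to byte k of the 32-bit datum, and the requirement that records be aligned to their own size are all assumptions made for illustration; the description above specifies only that the select marks which bytes are valid.

```python
def byte_select(record_bits, byte_offset=0):
    """Return a 4-bit mask with one bit set per valid byte.

    Assumes bit k marks byte k of the 32-bit datum and that a record
    is aligned to its own size within the datum.
    """
    if record_bits not in (8, 16, 32):
        raise ValueError("record size must be 8, 16 or 32 bits")
    nbytes = record_bits // 8
    if byte_offset % nbytes != 0 or byte_offset + nbytes > 4:
        raise ValueError("record does not fit at this offset")
    return ((1 << nbytes) - 1) << byte_offset


print(format(byte_select(16, 2), "04b"))  # 1100
```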
Control words are used to specify parameters and other control information and are discussed in detail in the sections below. For example, a control word can include information that indicates whether a parameter update is to be followed by a read using the updated port parameters.
Data address generator 606 is used to generate an address, or addresses, for use in reading from or writing to memory. The data address generator is configured by the DAG parameters included in the port parameters 602. Packer 612 is used to pack records received from memory into 32-bit data words for transmission to the consuming node. Packet assembly 610 is used to assemble the 32-bit data words into standard PTP, DMA or RTI packets for transmission to the consuming node.
In a preferred embodiment, the XMC node adheres to the same network protocol conventions as other nodes in the ACE. Examples of ACE network protocols in the preferred embodiment include Peek/Poke, MRA, PTP, DMA, RTI, message, etc. This allows XMC nodes to benefit from the same scaling features and adaptable architecture of the overall system. Details of a network protocol used in the preferred embodiment can be found in the related patent application entitled “Uniform Interface for a Functional Node in an Adaptive Computing Engine,” referenced above.
In a preferred embodiment of the XMC there are 64 ports—each one a combination input/output port. Three matrix interconnect network (MIN) (also referred to as the programmable interconnect network (PIN)) protocols—Direct-Memory-Access (DMA), Point-To-Point (PTP) and Real-Time-Input (RTI)—make use of these ports for both writing data to and reading data from memory.
Memory addresses for both writing and reading are generated by a logical DAG associated with each port. This logical DAG is actually a set of DAG parameters that are used to configure a single physical DAG, as needed, for memory writes and reads.
Each port also has a temporary buffer to temporarily store incoming PTP/RTI/DMA words from the MIN. The total size of all 64 temporary buffers is 16 Kbytes arranged as 4K×33 bit words. The 33rd bit of each word indicates whether a double-word is a data word or a control word, as described below.
Each XMC port is associated with a set of parameters that define the characteristics of that port. These parameters configure the XMC hardware when a port is called upon to perform one of the following tasks:
The value of each port parameter can be either static or dynamic. If static, then the parameter is updated only by a poke from the K-Node. If dynamic, then the parameter can be updated by a poke from the K-Node and also during normal XMC operation.
The Control and Status Bits described in Table A are the parameters that direct the behavior of ports and define their mode of operation.
The two DMA Bits in Table B are used to control DMA transfers from and to the XMC respectively:
The DAG parameters in Table C—together with DAG_Address_Mode—determine the sequence of addresses generated by the port's Data Address Generator. See section 3.2 for more details.
The Temporary-Buffer Parameters in Table D define the size of a port's temporary buffer and provide the write-pointer and read-pointer needed to implement a circular first-in-first-out queue.
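The write-pointer and read-pointer scheme can be sketched in C as follows. This is only an illustration under stated assumptions: the names temp_fifo, fifo_push and fifo_pop, the eight-word buffer size, and the separate count field used to tell a full queue from an empty one are all hypothetical and are not taken from Table D.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TEMP_BUF_WORDS 8u  /* illustrative size; the real size is a port parameter */

typedef struct {
    uint32_t words[TEMP_BUF_WORDS];
    uint32_t write_ptr;  /* next slot to fill */
    uint32_t read_ptr;   /* next slot to drain */
    uint32_t count;      /* words currently queued (assumed bookkeeping) */
} temp_fifo;

/* Queue one word; returns false when the buffer is full. */
static bool fifo_push(temp_fifo *f, uint32_t word) {
    assert(f != NULL);
    if (f->count == TEMP_BUF_WORDS)
        return false;
    f->words[f->write_ptr] = word;
    f->write_ptr = (f->write_ptr + 1u) % TEMP_BUF_WORDS;  /* circular wrap */
    f->count++;
    return true;
}

/* Remove the oldest word; returns false when the buffer is empty. */
static bool fifo_pop(temp_fifo *f, uint32_t *out) {
    assert(f != NULL && out != NULL);
    if (f->count == 0u)
        return false;
    *out = f->words[f->read_ptr];
    f->read_ptr = (f->read_ptr + 1u) % TEMP_BUF_WORDS;    /* circular wrap */
    f->count--;
    return true;
}
```

In this sketch both pointers advance modulo the buffer size, so the storage is reused indefinitely without moving any data.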
The Producer/Consumer Information in Table E is used in various fields in the MIN words that are sent to the Data Producer, Control Producer and Consumer.
The Counts in Table F provide flow control between (a) the Data and Control Producers and the XMC, (b) the temporary buffer and the main-memory buffer (when Buffer_Write=1) and (c) the XMC and the Consumer.
Table C, above, describes XMC DAG parameters. The 3 accessing modes (1-D, 2-D, and Bit_Reverse) are explained below. Special cases are also discussed relating to Y-Wrap and Burst Mode.
The DAG includes the ability to generate patterned addresses to memory. Three parameters—Index, Stride, and Limit—in each of X and Y define these patterns. In the simplest 1-dimensional case, the Index parameter is incremented by Stride, tested against the block size given by Limit, and then added to Origin to determine the final address.
Note that Stride is a signed quantity, and can be negative to enable stepping backwards through a block of memory addresses. If the Index is incremented/decremented outside the block (0 thru Limit-1), the Limit is subtracted/added respectively to bring the address back within the block. In this way, circular buffers with automatic wrap-around addressing are easily implemented. In general, any type of addressing, address incrementing/decrementing, indexing, etc., can be used with DAGs of different designs.
In a 1-D addressing mode, the DAG writes or reads addresses in a linear fashion. On each advance, DAG_X_Stride is added to DAG_X_Index, and the result is tested for being greater than or equal to DAG_X_Limit or less than 0 (since DAG_X_Stride can be negative). In these cases, DAG_X_Index is decremented or incremented, respectively, by DAG_X_Limit, thus restoring it to the proper range.
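The 1-D advance just described can be sketched in C as follows. The type and function names (dag_dim, dag_advance, dag_address_1d) are hypothetical. The sketch also assumes that the magnitude of the stride does not exceed the limit, so that a single correction restores the index to the range 0 through Limit-1, and that the address is formed by adding the current index to the Origin.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t index;   /* DAG_X_Index, in bytes */
    int32_t stride;  /* DAG_X_Stride, signed, in bytes */
    int32_t limit;   /* DAG_X_Limit, block size in bytes */
} dag_dim;

/* Advance the index by one stride.
 * Returns +1 on a positive wrap, -1 on a negative wrap, 0 otherwise. */
static int dag_advance(dag_dim *d) {
    assert(d->limit > 0);
    d->index += d->stride;
    if (d->index >= d->limit) {   /* stepped past the end of the block */
        d->index -= d->limit;
        return 1;
    }
    if (d->index < 0) {           /* stepped before the start of the block */
        d->index += d->limit;
        return -1;
    }
    return 0;
}

/* Final memory address: Origin plus the in-range index. */
static uint32_t dag_address_1d(uint32_t origin, const dag_dim *x) {
    return origin + (uint32_t)x->index;
}
```

With Origin=0x1000, Stride=4 and Limit=16, repeated calls visit 0x1000, 0x1004, 0x1008, 0x100C and then wrap back to 0x1000, which is the circular-buffer behavior described above.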
When in 1-D Write Mode, only, the DAG uses the DAG_Y_Index, DAG_Y_Stride, and DAG_Y_Limit parameters, not X, to compute the write address. This is so that read operations can be performed concurrently, using the X parameters in the usual way, to create a circular buffer such as a FIFO.
In a 2-D addressing mode, the DAG writes or reads addresses in 2-dimensional “scan-line” order, utilizing both the X and Y parameters similarly to the 1-D mode. X advance is performed first, and an X Wrap (either + or −) causes a Y advance (and thus a potential Y Wrap as well). See the DAG advance pseudo-code description in section 3.2.4 below.
Note that Y parameters are always specified in units of bytes, not scan lines or data items.
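A minimal C sketch of the 2-D "scan-line" advance is given below. It follows the prose above: X is advanced first, and an X Wrap of either sign triggers a Y advance. The names dag_axis, dag_2d, axis_advance, dag2d_advance and dag2d_address are hypothetical, and forming the address as Origin plus the Y index plus the X index is an assumption consistent with Y parameters being byte quantities.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t index, stride, limit;  /* all in bytes */
} dag_axis;

typedef struct {
    uint32_t origin;  /* base byte address of the block */
    dag_axis x, y;
} dag_2d;

/* Advance one axis; returns nonzero (+1 or -1) when the index wrapped. */
static int axis_advance(dag_axis *a) {
    assert(a->limit > 0);
    a->index += a->stride;
    if (a->index >= a->limit) { a->index -= a->limit; return 1; }
    if (a->index < 0)         { a->index += a->limit; return -1; }
    return 0;
}

/* Assumed address formation: Origin + Y index + X index. */
static uint32_t dag2d_address(const dag_2d *d) {
    return d->origin + (uint32_t)d->y.index + (uint32_t)d->x.index;
}

/* X advances first; an X wrap of either sign causes a Y advance. */
static void dag2d_advance(dag_2d *d) {
    if (axis_advance(&d->x) != 0)
        (void)axis_advance(&d->y);
}
```

For example, with an X stride of 4 and X limit of 8 (two double-words per line) and a Y stride of 16 and Y limit of 32 (two lines with a 16-byte pitch), the offsets visited from the Origin are 0, 4, 16, 20 and then 0 again.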
Bit-reversed addressing is included in the hardware to enable implementation of Fast Fourier Transforms and other interleaved or “butterfly” computations. In this mode, bits within the DAG_X_Index field are reversed (swapped) just prior to using them in the memory address computation.
In Bit_Reverse mode, DAG_X_Stride is not used as an increment, but instead determines the range of bits to reverse within DAG_X_Index. Specifically, the DAG_X_Stride should be set to reverse(1)=2^(n−1)=1/2 the size of the block in bytes. Bits p through n−1 will be reversed in the DAG_X_Index, where p=0, 1, 2 for Record_Size of byte, word, and dword, respectively.
Example: For a 2^12=4096-point FFT in byte mode, parameters might be
DAG_X_Index=0x0, DAG_X_Stride=0x800, DAG_X_Limit=0x1000.
Thus the hardware will reverse bits 0-11, and the address sequence is
As in other modes, the resulting reversed DAG_X_Index value is added to the Origin address before being used to access memory.
In Bit_Reverse mode, note that the starting DAG_X_Index, the DAG_X_Limit, and the Origin are byte addresses specified normally—NOT bit-reversed. However, in this mode, the Origin must be on a double-word boundary, i.e. bits [1:0]=00;
Although the X Wrap mechanism works in Bit_Reverse mode, typically DAG_X_Index is initialized to 0 and a single array of 2^n values will be addressed once.
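The bit reversal of DAG_X_Index can be sketched in C as follows. The function names are hypothetical. The sketch only shows how bits p through n−1 are swapped and how n can be derived from a stride of 2^(n−1); how the index itself is stepped between accesses in this mode is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse bits p..n-1 of index, leaving all other bits unchanged. */
static uint32_t reverse_index_bits(uint32_t index, unsigned p, unsigned n) {
    assert(p < n && n <= 32u);
    uint32_t out = index;
    for (unsigned lo = p, hi = n - 1u; lo < hi; lo++, hi--) {
        uint32_t bit_lo = (index >> lo) & 1u;
        uint32_t bit_hi = (index >> hi) & 1u;
        out &= ~((1u << lo) | (1u << hi));       /* clear both positions */
        out |= (bit_lo << hi) | (bit_hi << lo);  /* write them back swapped */
    }
    return out;
}

/* Recover n from a stride of the form 2^(n-1). */
static unsigned bits_from_stride(uint32_t stride) {
    assert(stride != 0u && (stride & (stride - 1u)) == 0u);  /* power of two */
    unsigned n = 0;
    while (stride != 0u) {
        n++;
        stride >>= 1;
    }
    return n;
}
```

For the 4096-point byte-mode example, DAG_X_Stride=0x800 gives n=12 and p=0, so index values 0, 1, 2 and 3 map to reversed values 0x000, 0x800, 0x400 and 0xC00 before the Origin is added.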
Combining the above parameter definitions, the calculation of the DAG memory addresses is as follows:
When the DAG is advanced:
Tables G-N, below, show “for loop” representations in C pseudo-code of various DAG addressing modes. Capitalized names such as Origin, Index, Stride, Limit, etc. represent the corresponding DAG parameters. The examples below all assume Record_Size=Dword=4 bytes, and positive strides. Note that DAG parameters are always given in units of bytes, not records.
Any of the 64 PTP/DMA ports can serve as the source of a DMA channel set up by the K-Node/Host. In a preferred embodiment, only one DMA channel to/from memory at a time can be supported.
Actions
When Status_Register[i].DMA_Go is poked with a 1,
When a Service Request for Port i is serviced with
Control_Register[i].Port_Type=DMA and Status_Register[i].DMA_Go=1:
The K-Node waits for a DMA Done message from the destination node before initiating a new DMA read/write from/to the same memory.
Direct-Memory-Access Write
Any of the 64 PTP/DMA ports can serve as the destination of a DMA channel set up by the K-Node/Host.
Actions
When a DMA Write from the MIN is received on Port i:
When a DMA Write Last Word Of Chunk from the MIN is received on Port i:
When a DMA Write Last Word from the MIN is received on Port i:
When a Service Request for DMA-Port i is serviced
The DMA source waits for a DMA Chunk Acknowledgement from the memory controller before transmitting the next chunk (chunk size must be less than or equal to the size of the port's temporary buffer).
The K-Node waits for a DMA Done message from the memory controller before initiating a new DMA read/write from/to the same memory.
Nodes may read and write memory and update selected port parameters via any of the 64 ports of the memory controller using the point-to-point protocol. The memory controller performs forward and backward ACKing and maintains Consumer_Counts and Producer_Counts.
The memory controller recognizes a data word where the payload field contains data to be written to memory and a control word where the payload field contains port-update information and a bit indicating whether the update is to be followed by a read using the DAG. When the update is followed by a read request the control word is called a Read Request. Table I, below, shows different types of control words. PTP data words and PTP control words may be sent to a memory Port in any order and are processed in the order received.
Generally, data words and control words sent to the XMC are generated independently by separate tasks running on separate nodes. Therefore, when the XMC sends acknowledgements to the nodes to indicate that a control word or other message or information has been received, the XMC must send separate acknowledgments, with appropriate values, to the task or node that is producing data words. The task or node that is producing the data word is referred to as the “Data Producer”. A task or node that is producing control words is referred to as the “Control Producer.” The XMC maintains information on the Data Producer and Control Producer in order to properly send backward acknowledgements to both.
In general, tasks or nodes can be referred to as a “process” or as a component that performs processing. Although specific reference may be made to hardware or software components, it should be apparent that functions described herein may be performed by hardware, software or a combination of hardware and software.
In a preferred embodiment, all words—both data and control—arriving at a PTP/RTI port on the XMC are placed sequentially into the same temporary buffer. For a case where two types of words are generated independently, typically by different nodes, it is necessary to allocate a portion of the temporary buffer to data words and a portion to control words to prevent buffer overflow.
When a PTP Write, PTP Packet-Mode Write or RTI Write from the MIN is received on Port i the following actions are performed:
In a preferred embodiment the XMC operates in eight basic modes. These include the following:
Basic Mode—Provides unrestricted reading of and writing to a data structure. A write occurs when there is data in the temporary buffer, and old data may be overwritten. A read occurs after an explicit read request has been received and there is available space in the input buffer of the consuming node. A read does not consume data.
High-Speed-Write Mode—Similar to Basic Mode with the exception that read requests are not supported, thereby achieving higher throughput in writing to memory.
Finite-Sink Mode—Provides a finite sink for data. A write occurs when there is data in the temporary buffer and available space in the main buffer. Writing stops when the main buffer is full.
Auto-Source Mode—Provides an infinite source of data. A read occurs automatically when there is available space in the input buffer of the consuming node. Read Requests are not used.
Buffer Mode—Implements a buffer/FIFO. Writes fill the main buffer while reads drain the main buffer. A write occurs when there is data in the temporary buffer and available space in the main buffer. A read occurs when there is sufficient data in the main buffer and available space in the consuming node's input buffer.
Y-Wrap Mode—Permits a write to memory to end in the middle of a double-word for the case when Record_Size is either byte or (16-bit) word.
Burst Mode—A special high-throughput mode for reading and writing 2-D blocks of bytes. Similar to Y-Wrap Mode in that writes to memory can end in the middle of a double-word.
Burst-Write Mode—Identical to Burst Mode except that—like High-Speed-Write Mode—read requests are not permitted. Achieves higher throughput than Burst Mode in writing to memory.
Basic Mode
Basic Mode supports writing to and reading from memory with no restrictions on Port_Type, DAG parameters or the use of PTP control words. Reads are initiated either by a read request when Port_Type is PTP, PTP_Packet_Mode or RTI or by poking a 1 into DMA_Go when Port_Type is DMA.
Table II lists the Control and Status Bit parameters that define Basic Mode.
Where:
High-Speed-Write Mode is similar to Basic Mode with the exception that read requests are not supported. This provides advantages such as eliminating the requirement that Producer_Count<0 before words are removed from the temporary buffer. Also, words can be removed from the temporary buffer at a higher rate.
Table III lists the Control and Status Bit parameters that define High-Speed-Write Mode.
Where:
Finite-Sink mode allows data to be written to memory and preserved from being overwritten by subsequent data. This is useful, for example, for storing statistics data, an error log, etc. Table IV lists the Control and Status Bit parameters that define Finite-Sink Mode.
Where:
An application may need to make use of tables of constants. For example, wave tables, pseudo-random data, etc., are typically written at system initialization and accessed in a continuous stream during real-time operation. Auto-Source Mode provides a means for accessing such data. Table V lists the Control and Status Bit parameters that define Auto-Source Mode.
Where:
In a preferred embodiment, a port in Buffer Mode implements a first-in-first-out queue. A delay line—a queue in which the amount of data in the queue remains above a threshold—is a form of FIFO and can also be implemented in Buffer Mode. Table VI lists the Control and Status Bit parameters that define Buffer Mode.
Where:
Y-Wrap Mode, along with Burst Mode and Burst-Write Mode, permits a write to memory to end in the middle of a double-word. Y-Wrap Mode can be used, for example, when writing a block of pixels (bytes) by rows into a two-dimensional frame buffer. In this case, the Y Wrap occurs when the last pixel of the block is written into memory. Any remaining bytes in the last data word are discarded and the next block of pixels begins with a new data word from the MIN. Table VII lists the Control and Status Bit parameters that define Y-Wrap Mode.
Where:
Burst Mode can be useful in imaging or video applications (e.g., MPEG4, HDTV, etc.) that have high bandwidth/throughput requirements. In a preferred embodiment, Burst Mode makes use of the Double Data Rate (DDR) feature of DDR DRAM. Other applications can use other types of memory and need not use the DDR feature. Burst Mode allows blocks of pixels to be either written to or read from memory at very high rates. Unlike Y-Wrap Mode, which ends a double-word on a Y Wrap, Burst Mode terminates writing (and reading) of a double-word on an X-Wrap. This difference means that each line, not just each block, begins with a new double-word. Table VIII lists the Control and Status Bit parameters that define Burst Mode.
Where:
Burst-Write Mode allows higher throughput than Burst Mode by not supporting read requests and by not requiring Producer_Count<0 in order to begin processing words from the temporary buffer. Table IX lists the Control and Status Bit parameters that define Burst-Write Mode.
Where:
The features of the XMC can be used to advantage in different ways depending on a specific application. For example, in a “data-sinking” application it is sometimes necessary to store information about system performance (e.g., statistics or an error log) in memory. The data may have to be stored in real time and prevented from being overwritten by subsequent data. An XMC port configured in Finite-Sink Mode can provide that capability. The parameter settings for this mode are shown in Table X, below.
Real-time data are written into a buffer in memory until the buffer becomes full whereupon writing ceases. The data can be read at any time via a read request.
Another application is known as “data sourcing”. Applications sometimes require a large or unbounded stream of fixed data—pseudo-random data or a wave table, for example—during real-time operation.
To provide the stream an XMC port can be configured in Auto-Source Mode accessing a circular buffer in memory containing the fixed data configured according to Table XI. The fixed data—which is typically written into the buffer at system initialization—can be supplied automatically to the consumer node, the flow being governed by normal PTP flow control using Forwards and Backwards ACKs. Because the buffer is circular and Buffer_Read is turned off, the port provides an infinite source of data.
Another type of application may require implementation of “delay lines.” For example, applications such as digital audio broadcast, personal video recorders, and modeling of acoustics can require a signal to be delayed by a number of samples. This requirement usually means that there will always be a certain minimum number of samples in the delay line once the line reaches steady-state operation (once the number of samples in the delay line reaches a threshold).
A delay line is implemented using a single port configured in Buffer Mode with Record_Size set to double-word as shown in Table XII. The circular buffer in main memory is accessed by DAG_X_Index for reading and DAG_Y_Index for writing. The initial value of Consumer_Count determines the length/size of the delay line: it is initialized to minus the size of the delay, converted to bytes.
For example, to implement a delay line of 1,000,000 double-words, a buffer of at least 4,000,000 bytes is allocated in memory and Consumer_Count is initialized to −4,000,000 as illustrated in Table. Because of the initial value of Consumer_Count, no output appears until at least 1,000,000 double-words have been written into the buffer and Consumer_Count has been incremented by a cumulative value of at least +4,000,000 (by Forward ACKs from the Data Producer). After that threshold has been reached and Consumer_Count has been driven non-negative, an auto read occurs.
In this example, the consumer node expects to get data from the delay line in blocks of 100 double-words, and so Read_Count is set to 100 (records). Upon an auto read, 100 double-words are removed from the buffer and sent to the Consumer (assuming Producer_Count<0). Consumer_Count is then decremented by 400 (bytes). If the new value of Consumer_Count is still non-negative, then another auto read occurs and the cycle is repeated. If the new value of Consumer_Count is negative, then reading is inhibited until additional double-words are written into the buffer and Consumer_Count is again driven non-negative.
In summary, once the number of samples in the delay line reaches at least 1,000,000 and Consumer_Count becomes non-negative, Consumer_Count never drops below −400 and the number of double-words in the delay line never drops below 999,900.
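The Consumer_Count arithmetic of this delay-line example can be checked with a small C model. The model is a sketch under simplifying assumptions: the struct and function names are hypothetical, each write delivers exactly one double-word with a forward ACK of 4 bytes, and Producer_Count is assumed to be negative whenever an auto read is attempted.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int64_t consumer_count;  /* bytes; a negative value inhibits reads */
    int64_t fill_dwords;     /* double-words currently held in the delay line */
    int64_t read_bytes;      /* Read_Count records times 4 bytes per record */
    int64_t reads;           /* number of auto reads performed so far */
} delay_line;

/* Consumer_Count starts at minus the delay size, converted to bytes. */
static void delay_line_init(delay_line *d, int64_t delay_dwords, int64_t read_count) {
    assert(delay_dwords > 0 && read_count > 0);
    d->consumer_count = -4 * delay_dwords;
    d->fill_dwords = 0;
    d->read_bytes = 4 * read_count;
    d->reads = 0;
}

/* Write one double-word, then auto read while Consumer_Count is non-negative. */
static void delay_line_write_dword(delay_line *d) {
    d->fill_dwords += 1;
    d->consumer_count += 4;               /* forward ACK of 4 bytes */
    while (d->consumer_count >= 0) {      /* assumes Producer_Count < 0 */
        d->fill_dwords -= d->read_bytes / 4;
        d->consumer_count -= d->read_bytes;
        d->reads++;
    }
}
```

Initialized with a delay of 1,000,000 double-words and Read_Count=100, the model performs no read during the first 999,999 writes. The 1,000,000th write triggers the first auto read, leaving Consumer_Count at −400 and 999,900 double-words in the line, which matches the bounds stated above.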
Another type of application may require “data reordering” in which the elements in a block of data need to be reordered. Table XIII illustrates an application—sometimes called a corner-turner or corner-bender—that interchanges the rows and columns of a two-dimensional block of data. The application example uses two XMC ports—Write Port i and Read Port j—both accessing the same two-dimensional buffer in memory.
For example, bytes can be written four at a time to memory by rows (lines) using Port i, which has its DAG configured in 1-D mode. (2-D mode could have been used, but 1-D is simpler and generates the same sequence of addresses.) When the Data Producer receives acknowledgement from the XMC that all data has been written to main memory, it signals the Consumer to begin reading. The Consumer sends a backwards ACK to XMC Port j thereby driving Producer_Count negative and enabling a read.
Bytes are read from memory by columns using Port j with the DAG in 2-D mode. But because reading is by columns and not rows, the usual roles of DAG_X_Index and DAG_Y_Index are reversed. DAG_X_Index now indexes successive bytes in a column, and DAG_Y_Index now indexes successive columns in the 2-D block. More precisely,
DAG_X_Index=R × line-length
DAG_Y_Index=C
where R and C are the row and column, respectively, of a byte in the 2-D block. After each byte is read, DAG_X_Index is incremented by line-length thereby accessing the next byte in the column. After the last byte in the column is read, DAG_X_Index reaches L × line-length, where L is the number of lines (rows) in the 2-D block. But L × line-length=buffer-size=DAG_X_Limit and therefore DAG_X_Index wraps around to 0 and DAG_Y_Index is incremented by 1. The cycle is repeated for each column until DAG_Y_Index=line-length=DAG_Y_Limit, the indication that the entire block has been read. When the Consumer receives the entire block of data, it signals the Data Producer to begin writing once again.
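The column-order read performed by Port j can be sketched in C as follows. The function name corner_turn is hypothetical, the buffer pointer stands in for the DAG Origin, and the sketch assumes an X stride of line-length, an X limit of buffer-size, a Y stride of 1 and a Y limit of line-length, as in the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a block of `rows` lines, each `line_length` bytes, column by column.
 * buf holds the block as written by rows; out receives rows * line_length bytes. */
static void corner_turn(const uint8_t *buf, size_t rows, size_t line_length,
                        uint8_t *out) {
    assert(buf != NULL && out != NULL && rows > 0 && line_length > 0);
    const size_t x_limit = rows * line_length;  /* DAG_X_Limit = buffer-size */
    size_t x_index = 0;                         /* R * line-length */
    size_t y_index = 0;                         /* C */
    for (size_t n = 0; n < x_limit; n++) {
        out[n] = buf[x_index + y_index];
        x_index += line_length;                 /* next byte in the same column */
        if (x_index >= x_limit) {               /* X wrap causes a Y advance */
            x_index -= x_limit;
            y_index += 1;                       /* next column */
            if (y_index >= line_length)         /* Y wrap: entire block read */
                y_index -= line_length;
        }
    }
}
```

For a block of two rows and three columns written as 1, 2, 3, 4, 5, 6, the sketch reads the bytes in the order 1, 4, 2, 5, 3, 6.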
The XMC allows interlacing, or multiplexing, of multiple data streams into a single data stream. In Table XIV two streams arriving on XMC Ports i and j are combined in memory and then read from memory via XMC Port k.
In a preferred embodiment interlacing of the two streams is accomplished by writing bytes arriving on Port i to even byte addresses in the main-memory buffer, and writing bytes arriving on Port j to odd byte addresses. (Note that when DAG_Y_Index for Port i wraps around it returns to 0, but when DAG_Y_Index for Port j wraps around it returns to 1.)
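The even/odd write pattern can be sketched in C as follows. The function name interlace_write is hypothetical, and the sketch assumes a write stride of 2 bytes, a limit equal to an even buffer size, and starting index values of 0 for Port i and 1 for Port j.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write n bytes from src into every second byte of buf, beginning at `start`
 * (0 for the even stream, 1 for the odd stream).
 * Returns the write index after the last byte, wrapped into the buffer. */
static size_t interlace_write(uint8_t *buf, size_t buf_size, size_t start,
                              const uint8_t *src, size_t n) {
    assert(buf != NULL && src != NULL);
    assert(buf_size >= 2 && buf_size % 2 == 0 && start < 2);
    size_t y_index = start;
    for (size_t k = 0; k < n; k++) {
        buf[y_index] = src[k];
        y_index += 2;                 /* skip the other stream's byte */
        if (y_index >= buf_size)      /* wrap: returns to 0 or 1 */
            y_index -= buf_size;
    }
    return y_index;
}
```

Because the stride and the buffer size are both even, an index that starts even stays even and wraps to 0, while an index that starts odd stays odd and wraps to 1, matching the note above.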
Synchronizing of writing and reading is accomplished using a double-buffering scheme in which the two Data Producers write into one half of the main-memory buffer while the Consumer reads the other half. To make the scheme work, each Data Producer signals the Consumer when it receives acknowledgement from the XMC that buffer-size/4 bytes have been written into the main-memory buffer. When the Consumer receives a signal from each Data Producer, it sends a backwards ACK to XMC Port k thereby driving Producer_Count negative and enabling a read of the interlaced data. When the Consumer receives buffer-size/2 bytes of interlaced data, it signals each Data Producer that they are permitted to write into the buffer half just read.
In data de-interlacing (de-multiplexing), instead of merging two data streams into one, one data stream is separated into two.
Table XV illustrates an application that reverses the interlacing operation described in the preceding section. The input data stream arrives on XMC Port i and the two de-interlaced streams exit the XMC via Ports j and k. De-interlacing is accomplished by reading even bytes in the main-memory buffer using Port j and odd bytes using Port k. (Note that when DAG_X_Index for Port j wraps around it returns to 0, but when DAG_X_Index for Port k wraps around it returns to 1.)
Synchronizing of writing and reading is accomplished using a double-buffering scheme in which the Data Producer writes into one half of the main-memory buffer while the two Consumers read the other half. To make the scheme work, the Data Producer notifies the Consumers when it receives acknowledgement from the XMC that buffer-size/2 bytes have been written into the buffer. When the two Consumers receive the signal, they each send a backwards ACK to their XMC read port thereby driving Producer_Count negative and enabling a read of the de-interlaced data. When each Consumer receives buffer-size/4 bytes of data, it notifies the Data Producer that reading of the half buffer has been completed. The Data Producer waits until it receives notification from both Consumers before it begins writing into the just-vacated half buffer.
Many video compression algorithms (e.g., MPEG) require reading numerous rectangular blocks of pixels (bytes) from a frame buffer. Table XVI illustrates an application in which data are written sequentially into a frame buffer via XMC Port i and in which rectangular blocks within the frame are read via XMC Port j.
A Data Producer for Port i writes data into the frame buffer line-by-line via Port i, and when it receives acknowledgement from the XMC that the entire frame has been written to memory, it notifies the Control Producer for Port j.
A Control Producer for Port j then sends a separate read request to Port j for each block of pixels to be read, the parameter-update value in the request being used to update DAG_Origin. This newly updated value for DAG_Origin determines the location of the block to be read. The remaining DAG parameters determine the size of the block to be read. Table illustrates the parameter settings for a 9×9 block of pixels (bytes).
The XMC provides a scheme employing indirect addressing. In indirect addressing data is accessed in two steps: (1) an address (pointer) is used to access a second address (pointer) and (2) this second address is used in turn to access user data. The XMC implements indirect addressing via two tables, Table A and Table B, both residing in main memory as shown in Table XVII. Table A—which is accessed via XMC Port j—contains pointers into Table B. Table B—which is accessed via XMC Port k—contains user data.
Port j is configured in Auto-Source Mode and the entries in Table A are read automatically, in order, and sent via PTP control words from XMC Port j to XMC Port k. (Note the Consumer_ID and Consumer_Port for Port j.) Normal PTP flow control between Port j and Port k guarantees that the input buffer on Port k never overflows.
Each entry in Table A has a format where bit 31 (TableAEntry[31]) is set to 1, bits 30-28 (TableAEntry[30:28]) are set to 001 and bits 27-0 are used for the new DAG_X_Index value. TableAEntry[30:28]=001 indicates that DAG_X_Index[k] is to be updated with the value in TableAEntry[27:0]. TableAEntry[31]=1 indicates that the update is to be immediately followed by a read of Table B.
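The Table A entry layout can be sketched in C as follows. The helper names are hypothetical. Only the bit positions come from the description above: bit 31 requests a read after the update, bits 30-28 hold the value 001 selecting DAG_X_Index, and bits 27-0 carry the new index value.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_READ_BIT     (1u << 31)     /* update is followed by a read */
#define ENTRY_SEL_SHIFT    28u
#define ENTRY_SEL_X_INDEX  0x1u           /* 001: update DAG_X_Index */
#define ENTRY_VALUE_MASK   0x0FFFFFFFu    /* bits 27-0 */

/* Build a Table A entry that updates DAG_X_Index and then requests a read. */
static uint32_t make_table_a_entry(uint32_t new_x_index) {
    assert(new_x_index <= ENTRY_VALUE_MASK);  /* value must fit in 28 bits */
    return ENTRY_READ_BIT | (ENTRY_SEL_X_INDEX << ENTRY_SEL_SHIFT) | new_x_index;
}

/* Field extractors for a received entry. */
static int entry_requests_read(uint32_t entry) {
    return (int)((entry >> 31) & 1u);
}

static unsigned entry_param_select(uint32_t entry) {
    return (unsigned)((entry >> ENTRY_SEL_SHIFT) & 0x7u);
}

static uint32_t entry_value(uint32_t entry) {
    return entry & ENTRY_VALUE_MASK;
}
```

For example, a pointer to byte offset 0x123 in Table B would be stored in Table A as 0x90000123 under this layout.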
Port k responds to read requests from Port j as it would from any other source. It updates the appropriate DAG parameter—DAG_X_Index in this case—and then sends Read_Count records to the consumer of user data. Normal PTP flow control between XMC Port k and the data consumer guarantees that the data-consumer's input buffer never overflows.
Although the invention has been described with respect to specific embodiments thereof, these embodiments are merely illustrative, and not restrictive, of the invention. For example, although a PIN has been described as a data transfer mechanism other embodiments can use any type of network or interconnection scheme.
Any suitable programming language can be used to implement the routines of the present invention including C, C++, Java, assembly language, etc. Different programming techniques can be employed such as procedural or object oriented. The routines can execute on a single processing device or multiple processors. Although the steps, operations or computations may be presented in a specific order, this order may be changed in different embodiments. In some embodiments, multiple steps shown as sequential in this specification can be performed at the same time. The sequence of operations described herein can be interrupted, suspended, or otherwise controlled by another process, such as an operating system, kernel, etc. The routines can operate in an operating system environment or as stand-alone routines occupying all, or a substantial part, of the system processing.
In the description herein, numerous specific details are provided, such as examples of components and/or methods, to provide a thorough understanding of embodiments of the present invention. One skilled in the relevant art will recognize, however, that an embodiment of the invention can be practiced without one or more of the specific details, or with other apparatus, systems, assemblies, methods, components, materials, parts, and/or the like. In other instances, well-known structures, materials, or operations are not specifically shown or described in detail to avoid obscuring aspects of embodiments of the present invention.
A “computer-readable medium” for purposes of embodiments of the present invention may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, system or device. The computer readable medium can be, by way of example only but not by limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, system, device, propagation medium, or computer memory.
A “processor” or “process” includes any human, hardware and/or software system, mechanism or component that processes data, signals or other information. A processor can include a system with a general-purpose central processing unit, multiple processing units, dedicated circuitry for achieving functionality, or other systems. Processing need not be limited to a geographic location, or have temporal limitations. For example, a processor can perform its functions in “real time,” “offline,” in a “batch mode,” etc. Portions of processing can be performed at different times and at different locations, by different (or the same) processing systems.
Reference throughout this specification to “one embodiment”, “an embodiment”, or “a specific embodiment” means that particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention and not necessarily in all embodiments. Thus, respective appearances of the phrases “in one embodiment”, “in an embodiment”, or “in a specific embodiment” in various places throughout this specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics of any specific embodiment of the present invention may be combined in any suitable manner with one or more other embodiments. It is to be understood that other variations and modifications of the embodiments of the present invention described and illustrated herein are possible in light of the teachings herein and are to be considered as part of the spirit and scope of the present invention.
Embodiments of the invention may be implemented by using a programmed general purpose digital computer, or by using application specific integrated circuits, programmable logic devices, or field programmable gate arrays. Optical, chemical, biological, quantum or nanoengineered systems, components and mechanisms may also be used. In general, the functions of the present invention can be achieved by any means as is known in the art. Distributed, or networked systems, components and circuits can be used. Communication, or transfer, of data may be wired, wireless, or by any other means.
It will also be appreciated that one or more of the elements depicted in the drawings/figures can also be implemented in a more separated or integrated manner, or even removed or rendered as inoperable in certain cases, as is useful in accordance with a particular application. It is also within the spirit and scope of the present invention to implement a program or code that can be stored in a machine-readable medium to permit a computer to perform any of the methods described above.
Additionally, any signal arrows in the drawings/Figures should be considered only as exemplary, and not limiting, unless otherwise specifically noted. Furthermore, the term “or” as used herein is generally intended to mean “and/or” unless otherwise indicated. Combinations of components or steps will also be considered as being noted, where terminology is foreseen as rendering the ability to separate or combine unclear.
As used in the description herein and throughout the claims that follow, “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
The foregoing description of illustrated embodiments of the present invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed herein. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes only, various equivalent modifications are possible within the spirit and scope of the present invention, as those skilled in the relevant art will recognize and appreciate. As indicated, these modifications may be made to the present invention in light of the foregoing description of illustrated embodiments of the present invention and are to be included within the spirit and scope of the present invention.
Thus, while the present invention has been described herein with reference to particular embodiments thereof, a latitude of modification, various changes and substitutions are intended in the foregoing disclosures, and it will be appreciated that in some instances some features of embodiments of the invention will be employed without a corresponding use of other features without departing from the scope and spirit of the invention as set forth. Therefore, many modifications may be made to adapt a particular situation or material to the essential scope and spirit of the present invention. It is intended that the invention not be limited to the particular terms used in following claims and/or to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include any and all embodiments and equivalents falling within the scope of the appended claims.
This application claims priority from U.S. Provisional Patent Application No. 60/428,646, filed on Nov. 22, 2002. This application is a continuation of U.S. patent application Ser. No. 13/178,125, filed Jul. 7, 2011, which is a continuation of U.S. patent application Ser. No. 12/251,871, filed Oct. 10, 2008, which is a continuation of U.S. patent application Ser. No. 12/115,843, filed May 6, 2008, now U.S. Pat. No. 7,743,220, which is a continuation of U.S. patent application Ser. No. 11/803,998, filed May 16, 2007, now U.S. Pat. No. 7,451,280, which is a continuation of U.S. patent application Ser. No. 10/719,921, filed Nov. 20, 2003, now U.S. Pat. No. 7,225,301, which claims priority from U.S. Provisional Application No. 60/428,646. Priority is claimed from all of these applications and these applications are hereby incorporated by reference as if set forth in full in this application for all purposes. This application is related to the following U.S. patent applications, each of which is hereby incorporated by reference as if set forth in full in this document for all purposes: Ser. No. 09/815,122, entitled “Adaptive Integrated Circuitry with Heterogeneous and Reconfigurable Matrices of Diverse and Adaptive Computational Units having Fixed, Application Specific Computational Elements,” filed on Mar. 22, 2001; and Ser. No. 10/443,554, entitled “Uniform Interface for a Functional Node in an Adaptive Computing Engine,” filed on May 21, 2003.
Number | Name | Date | Kind |
---|---|---|---|
4302775 | Widergren et al. | Nov 1981 | A |
4380046 | Fung et al. | Apr 1983 | A |
4578799 | Scholl et al. | Mar 1986 | A |
4633386 | Terepin et al. | Dec 1986 | A |
4649512 | Nukiyama | Mar 1987 | A |
4694416 | Wheeler et al. | Sep 1987 | A |
4713755 | Worley, Jr. et al. | Dec 1987 | A |
4748585 | Chiarulli et al. | May 1988 | A |
4758985 | Carter | Jul 1988 | A |
4760525 | Webb | Jul 1988 | A |
4760544 | Lamb | Jul 1988 | A |
4811214 | Nosenchuck et al. | Mar 1989 | A |
4870302 | Freeman | Sep 1989 | A |
4905231 | Leung et al. | Feb 1990 | A |
4967340 | Dawes | Oct 1990 | A |
5021947 | Campbell et al. | Jun 1991 | A |
5090015 | Dabbish et al. | Feb 1992 | A |
5099418 | Pian et al. | Mar 1992 | A |
5144166 | Camarota et al. | Sep 1992 | A |
5165023 | Gifford | Nov 1992 | A |
5193151 | Jain | Mar 1993 | A |
5202993 | Tarsy et al. | Apr 1993 | A |
5218240 | Camarota et al. | Jun 1993 | A |
5245227 | Furtek et al. | Sep 1993 | A |
5261099 | Bigo et al. | Nov 1993 | A |
5301100 | Wagner | Apr 1994 | A |
5335276 | Thompson et al. | Aug 1994 | A |
5336950 | Popli et al. | Aug 1994 | A |
5339428 | Burmeister et al. | Aug 1994 | A |
5361362 | Benkeser et al. | Nov 1994 | A |
5379343 | Grube et al. | Jan 1995 | A |
5381546 | Servi et al. | Jan 1995 | A |
5381550 | Jourdenais et al. | Jan 1995 | A |
5388062 | Knutson | Feb 1995 | A |
5388212 | Grube et al. | Feb 1995 | A |
5428754 | Baldwin | Jun 1995 | A |
5450557 | Kopp et al. | Sep 1995 | A |
5465368 | Davidson et al. | Nov 1995 | A |
5475856 | Kogge | Dec 1995 | A |
5479055 | Eccles | Dec 1995 | A |
5490165 | Blakeney, II et al. | Feb 1996 | A |
5491823 | Ruttenberg | Feb 1996 | A |
5507009 | Grube et al. | Apr 1996 | A |
5515519 | Yoshioka et al. | May 1996 | A |
5517600 | Shimokawa | May 1996 | A |
5519694 | Brewer et al. | May 1996 | A |
5522070 | Sumimoto | May 1996 | A |
5530964 | Alpert et al. | Jun 1996 | A |
5534796 | Edwards | Jul 1996 | A |
5555417 | Odnert et al. | Sep 1996 | A |
5560028 | Sachs et al. | Sep 1996 | A |
5560038 | Haddock | Sep 1996 | A |
5572572 | Kawan et al. | Nov 1996 | A |
5590353 | Sakakibara et al. | Dec 1996 | A |
5594657 | Cantone et al. | Jan 1997 | A |
5600810 | Ohkami | Feb 1997 | A |
5600844 | Shaw et al. | Feb 1997 | A |
5602833 | Zehavi | Feb 1997 | A |
5603043 | Taylor et al. | Feb 1997 | A |
5623545 | Childs et al. | Apr 1997 | A |
5625669 | McGregor et al. | Apr 1997 | A |
5630206 | Urban et al. | May 1997 | A |
5635940 | Hickman et al. | Jun 1997 | A |
5646544 | Iadanza | Jul 1997 | A |
5646545 | Trimberger et al. | Jul 1997 | A |
5684793 | Kiema et al. | Nov 1997 | A |
5684980 | Casselman | Nov 1997 | A |
5694613 | Suzuki | Dec 1997 | A |
5701398 | Glier et al. | Dec 1997 | A |
5701482 | Harrison et al. | Dec 1997 | A |
5704053 | Santhanam | Dec 1997 | A |
5706191 | Bassett et al. | Jan 1998 | A |
5712996 | Schepers | Jan 1998 | A |
5720002 | Wang | Feb 1998 | A |
5721693 | Song | Feb 1998 | A |
5721854 | Ebicioglu et al. | Feb 1998 | A |
5729754 | Estes | Mar 1998 | A |
5734808 | Takeda | Mar 1998 | A |
5737631 | Trimberger | Apr 1998 | A |
5742180 | DeHon et al. | Apr 1998 | A |
5742821 | Prasanna | Apr 1998 | A |
5751295 | Becklund et al. | May 1998 | A |
5754227 | Fukuoka | May 1998 | A |
5758261 | Wiedeman | May 1998 | A |
5768561 | Wise | Jun 1998 | A |
5771362 | Bartkowiak et al. | Jun 1998 | A |
5778439 | Trimberger et al. | Jul 1998 | A |
5784636 | Rupp | Jul 1998 | A |
5784699 | McMahon et al. | Jul 1998 | A |
5787237 | Reilly | Jul 1998 | A |
5790817 | Asghar et al. | Aug 1998 | A |
5794062 | Baxter | Aug 1998 | A |
5794067 | Kadowaki | Aug 1998 | A |
5802055 | Krein et al. | Sep 1998 | A |
5802278 | Isfeld et al. | Sep 1998 | A |
5818603 | Motoyama | Oct 1998 | A |
5822308 | Weigand et al. | Oct 1998 | A |
5822313 | Malek et al. | Oct 1998 | A |
5822360 | Lee et al. | Oct 1998 | A |
5828858 | Athanas et al. | Oct 1998 | A |
5835753 | Witt | Nov 1998 | A |
5838165 | Chatter | Nov 1998 | A |
5838894 | Horst | Nov 1998 | A |
5860021 | Klingman | Jan 1999 | A |
5870427 | Tiedemann, Jr. et al. | Feb 1999 | A |
5873045 | Lee et al. | Feb 1999 | A |
5881106 | Cartier | Mar 1999 | A |
5884284 | Peters et al. | Mar 1999 | A |
5886537 | Macias et al. | Mar 1999 | A |
5887174 | Simons et al. | Mar 1999 | A |
5889816 | Agrawal et al. | Mar 1999 | A |
5890014 | Long | Mar 1999 | A |
5892900 | Ginter et al. | Apr 1999 | A |
5892961 | Trimberger | Apr 1999 | A |
5892962 | Cloutier | Apr 1999 | A |
5894473 | Dent | Apr 1999 | A |
5903886 | Heimlich et al. | May 1999 | A |
5907580 | Cummings | May 1999 | A |
5910733 | Bertolet et al. | Jun 1999 | A |
5912572 | Graf, III | Jun 1999 | A |
5913172 | McCabe et al. | Jun 1999 | A |
5917852 | Butterfield et al. | Jun 1999 | A |
5920801 | Thomas et al. | Jul 1999 | A |
5931918 | Row et al. | Aug 1999 | A |
5933642 | Greenbaum et al. | Aug 1999 | A |
5940438 | Poon et al. | Aug 1999 | A |
5949415 | Lin et al. | Sep 1999 | A |
5950011 | Albrecht et al. | Sep 1999 | A |
5950131 | Vilmur | Sep 1999 | A |
5951674 | Moreno | Sep 1999 | A |
5953322 | Kimball | Sep 1999 | A |
5956518 | DeHon et al. | Sep 1999 | A |
5959881 | Trimberger et al. | Sep 1999 | A |
5963048 | Harrison et al. | Oct 1999 | A |
5966534 | Cooke et al. | Oct 1999 | A |
5970254 | Cooke et al. | Oct 1999 | A |
5987611 | Freund | Nov 1999 | A |
5991302 | Berl et al. | Nov 1999 | A |
5991308 | Fuhrmann et al. | Nov 1999 | A |
5999734 | Willis et al. | Dec 1999 | A |
6005943 | Cohen et al. | Dec 1999 | A |
6006249 | Leong | Dec 1999 | A |
6016395 | Mohamed | Jan 2000 | A |
6018783 | Chiang | Jan 2000 | A |
6021186 | Suzuki et al. | Feb 2000 | A |
6021492 | May | Feb 2000 | A |
6023742 | Ebeling et al. | Feb 2000 | A |
6023755 | Casselman | Feb 2000 | A |
6028610 | Deering | Feb 2000 | A |
6041322 | Meng et al. | Mar 2000 | A |
6046603 | New | Apr 2000 | A |
6047115 | Mohan et al. | Apr 2000 | A |
6052600 | Fette et al. | Apr 2000 | A |
6055314 | Spies et al. | Apr 2000 | A |
6056194 | Kolls | May 2000 | A |
6059840 | Click, Jr. | May 2000 | A |
6061580 | Altschul et al. | May 2000 | A |
6073132 | Gehman | Jun 2000 | A |
6076174 | Freund | Jun 2000 | A |
6078736 | Guccione | Jun 2000 | A |
6088043 | Kelleher et al. | Jul 2000 | A |
6091263 | New et al. | Jul 2000 | A |
6091765 | Pietzold, III et al. | Jul 2000 | A |
6094065 | Tavana et al. | Jul 2000 | A |
6094726 | Gonion et al. | Jul 2000 | A |
6111893 | Volftsun et al. | Aug 2000 | A |
6111935 | Hughes-Hartogs | Aug 2000 | A |
6115751 | Tam et al. | Sep 2000 | A |
6119178 | Martin et al. | Sep 2000 | A |
6120551 | Law et al. | Sep 2000 | A |
6122670 | Bennett et al. | Sep 2000 | A |
6128307 | Brown | Oct 2000 | A |
6134605 | Hudson et al. | Oct 2000 | A |
6134629 | L'Ecuyer | Oct 2000 | A |
6141283 | Bogin et al. | Oct 2000 | A |
6150838 | Wittig et al. | Nov 2000 | A |
6154492 | Araki et al. | Nov 2000 | A |
6154494 | Sugahara et al. | Nov 2000 | A |
6157997 | Oowaki et al. | Dec 2000 | A |
6173389 | Pechanek et al. | Jan 2001 | B1 |
6175854 | Bretscher | Jan 2001 | B1 |
6175892 | Sazzad et al. | Jan 2001 | B1 |
6185418 | MacLellan et al. | Feb 2001 | B1 |
6192070 | Poon et al. | Feb 2001 | B1 |
6192255 | Lewis et al. | Feb 2001 | B1 |
6192388 | Cajolet | Feb 2001 | B1 |
6195788 | Leaver et al. | Feb 2001 | B1 |
6198924 | Ishii et al. | Mar 2001 | B1 |
6199181 | Rechef et al. | Mar 2001 | B1 |
6202130 | Scales, III et al. | Mar 2001 | B1 |
6202189 | Hinedi et al. | Mar 2001 | B1 |
6219697 | Lawande et al. | Apr 2001 | B1 |
6219756 | Kasamizugami | Apr 2001 | B1 |
6219780 | Lipasti | Apr 2001 | B1 |
6223222 | Fijolek et al. | Apr 2001 | B1 |
6226387 | Tewfik et al. | May 2001 | B1 |
6230307 | Davis et al. | May 2001 | B1 |
6237029 | Master et al. | May 2001 | B1 |
6246883 | Lee | Jun 2001 | B1 |
6247125 | Noel-Baron et al. | Jun 2001 | B1 |
6249251 | Chang et al. | Jun 2001 | B1 |
6263057 | Silverman | Jul 2001 | B1 |
6266760 | DeHon et al. | Jul 2001 | B1 |
6272579 | Lentz et al. | Aug 2001 | B1 |
6272616 | Fernando et al. | Aug 2001 | B1 |
6281703 | Furuta et al. | Aug 2001 | B1 |
6282627 | Wong et al. | Aug 2001 | B1 |
6289375 | Knight et al. | Sep 2001 | B1 |
6289434 | Roy | Sep 2001 | B1 |
6289488 | Dave et al. | Sep 2001 | B1 |
6292822 | Hardwick | Sep 2001 | B1 |
6292827 | Raz | Sep 2001 | B1 |
6301653 | Mohamed et al. | Oct 2001 | B1 |
6305014 | Roediger et al. | Oct 2001 | B1 |
6311149 | Ryan et al. | Oct 2001 | B1 |
6326806 | Fallside et al. | Dec 2001 | B1 |
6346824 | New | Feb 2002 | B1 |
6347346 | Taylor | Feb 2002 | B1 |
6349394 | Brock et al. | Feb 2002 | B1 |
6353841 | Marshall et al. | Mar 2002 | B1 |
6356994 | Barry et al. | Mar 2002 | B1 |
6359248 | Mardi | Mar 2002 | B1 |
6360256 | Lim | Mar 2002 | B1 |
6360259 | Bradley | Mar 2002 | B1 |
6360263 | Kurtzberg et al. | Mar 2002 | B1 |
6366999 | Drabenstott et al. | Apr 2002 | B1 |
6378072 | Collins et al. | Apr 2002 | B1 |
6381293 | Lee et al. | Apr 2002 | B1 |
6381735 | Hunt | Apr 2002 | B1 |
6385751 | Wolf | May 2002 | B1 |
6405214 | Meade, II | Jun 2002 | B1 |
6408039 | Ito | Jun 2002 | B1 |
6410941 | Taylor et al. | Jun 2002 | B1 |
6411612 | Halford et al. | Jun 2002 | B1 |
6421372 | Bierly et al. | Jul 2002 | B1 |
6421809 | Wuytack et al. | Jul 2002 | B1 |
6426649 | Fu et al. | Jul 2002 | B1 |
6430624 | Jamtgaard et al. | Aug 2002 | B1 |
6433578 | Wasson | Aug 2002 | B1 |
6434590 | Blelloch et al. | Aug 2002 | B1 |
6438737 | Morelli et al. | Aug 2002 | B1 |
6456996 | Crawford, Jr. et al. | Sep 2002 | B1 |
6459883 | Subramanian et al. | Oct 2002 | B2 |
6467009 | Winegarden et al. | Oct 2002 | B1 |
6469540 | Nakaya | Oct 2002 | B2 |
6473609 | Schwartz et al. | Oct 2002 | B1 |
6483343 | Faith et al. | Nov 2002 | B1 |
6507947 | Schreiber et al. | Jan 2003 | B1 |
6510138 | Pannell | Jan 2003 | B1 |
6510510 | Garde | Jan 2003 | B1 |
6538470 | Langhammer et al. | Mar 2003 | B1 |
6556044 | Langhammer et al. | Apr 2003 | B2 |
6563891 | Eriksson et al. | May 2003 | B1 |
6570877 | Kloth et al. | May 2003 | B1 |
6577678 | Scheuermann | Jun 2003 | B2 |
6587684 | Hsu et al. | Jul 2003 | B1 |
6590415 | Agrawal et al. | Jul 2003 | B2 |
6601086 | Howard et al. | Jul 2003 | B1 |
6601158 | Abbott et al. | Jul 2003 | B1 |
6604085 | Kolls | Aug 2003 | B1 |
6604189 | Zemlyak et al. | Aug 2003 | B1 |
6606529 | Crowder, Jr. et al. | Aug 2003 | B1 |
6611908 | Lentz et al. | Aug 2003 | B2 |
6615295 | Shah | Sep 2003 | B2 |
6615333 | Hoogerbrugge et al. | Sep 2003 | B1 |
6618434 | Heidari-Bateni et al. | Sep 2003 | B2 |
6618777 | Greenfield | Sep 2003 | B1 |
6640304 | Ginter et al. | Oct 2003 | B2 |
6647429 | Semal | Nov 2003 | B1 |
6653859 | Sihlbom et al. | Nov 2003 | B2 |
6675265 | Barroso et al. | Jan 2004 | B2 |
6675284 | Warren | Jan 2004 | B1 |
6684319 | Mohamed et al. | Jan 2004 | B1 |
6691148 | Zinky et al. | Feb 2004 | B1 |
6694380 | Wolrich et al. | Feb 2004 | B1 |
6711617 | Bantz et al. | Mar 2004 | B1 |
6718182 | Kung | Apr 2004 | B1 |
6721286 | Williams et al. | Apr 2004 | B1 |
6721884 | De Oliveira Kastrup Pereira et al. | Apr 2004 | B1 |
6732354 | Ebeling et al. | May 2004 | B2 |
6735621 | Yoakum et al. | May 2004 | B1 |
6738744 | Kirovski et al. | May 2004 | B2 |
6751723 | Kundu et al. | Jun 2004 | B1 |
6754470 | Hendrickson et al. | Jun 2004 | B2 |
6760587 | Holtzman et al. | Jul 2004 | B2 |
6760833 | Dowling | Jul 2004 | B1 |
6766165 | Sharma et al. | Jul 2004 | B2 |
6775758 | Shah | Aug 2004 | B2 |
6778212 | Deng et al. | Aug 2004 | B1 |
6782336 | Shah | Aug 2004 | B2 |
6785341 | Walton et al. | Aug 2004 | B2 |
6807590 | Carlson et al. | Oct 2004 | B1 |
6819140 | Yamanaka et al. | Nov 2004 | B2 |
6823448 | Roth et al. | Nov 2004 | B2 |
6829633 | Gelfer et al. | Dec 2004 | B2 |
6832250 | Coons et al. | Dec 2004 | B1 |
6836839 | Master et al. | Dec 2004 | B2 |
6859434 | Segal et al. | Feb 2005 | B2 |
6865664 | Budrovic et al. | Mar 2005 | B2 |
6871236 | Fishman et al. | Mar 2005 | B2 |
6883074 | Lee et al. | Apr 2005 | B2 |
6883084 | Donohoe | Apr 2005 | B1 |
6889283 | Shah | May 2005 | B2 |
6894996 | Lee | May 2005 | B2 |
6901440 | Bimm et al. | May 2005 | B1 |
6901467 | Shah et al. | May 2005 | B2 |
6907598 | Fraser | Jun 2005 | B2 |
6912515 | Jackson et al. | Jun 2005 | B2 |
6941336 | Mar | Sep 2005 | B1 |
6950897 | Hensley et al. | Sep 2005 | B2 |
6980515 | Schunk et al. | Dec 2005 | B1 |
6985517 | Matsumoto et al. | Jan 2006 | B2 |
6986021 | Master et al. | Jan 2006 | B2 |
6986142 | Ehlig et al. | Jan 2006 | B1 |
6988139 | Jervis et al. | Jan 2006 | B1 |
7028116 | Shah | Apr 2006 | B2 |
7032229 | Flores et al. | Apr 2006 | B1 |
7044741 | Leem | May 2006 | B2 |
7082456 | Mani-Meitav et al. | Jul 2006 | B2 |
7139910 | Ainsworth et al. | Nov 2006 | B1 |
7142731 | Toi | Nov 2006 | B1 |
7249242 | Ramchandran | Jul 2007 | B2 |
20010003191 | Kovacs et al. | Jun 2001 | A1 |
20010023482 | Wray | Sep 2001 | A1 |
20010029515 | Mirsky | Oct 2001 | A1 |
20010034795 | Moulton et al. | Oct 2001 | A1 |
20010039654 | Miyamoto | Nov 2001 | A1 |
20010048713 | Medlock et al. | Dec 2001 | A1 |
20010048714 | Jha | Dec 2001 | A1 |
20010050948 | Ramberg et al. | Dec 2001 | A1 |
20020010848 | Kamano et al. | Jan 2002 | A1 |
20020013799 | Blaker | Jan 2002 | A1 |
20020013937 | Ostanevich et al. | Jan 2002 | A1 |
20020015435 | Rieken | Feb 2002 | A1 |
20020023210 | Tuomenoksa et al. | Feb 2002 | A1 |
20020024942 | Tsuneki et al. | Feb 2002 | A1 |
20020024993 | Subramanian et al. | Feb 2002 | A1 |
20020031166 | Subramanian et al. | Mar 2002 | A1 |
20020032551 | Zakiya | Mar 2002 | A1 |
20020035623 | Lawande et al. | Mar 2002 | A1 |
20020041581 | Aramaki | Apr 2002 | A1 |
20020042907 | Yamanaka et al. | Apr 2002 | A1 |
20020061741 | Leung et al. | May 2002 | A1 |
20020069282 | Reisman | Jun 2002 | A1 |
20020072830 | Hunt | Jun 2002 | A1 |
20020078337 | Moreau et al. | Jun 2002 | A1 |
20020083247 | Shah | Jun 2002 | A1 |
20020083257 | Shah | Jun 2002 | A1 |
20020083305 | Renard et al. | Jun 2002 | A1 |
20020083423 | Ostanevich et al. | Jun 2002 | A1 |
20020087829 | Snyder et al. | Jul 2002 | A1 |
20020089348 | Langhammer | Jul 2002 | A1 |
20020101909 | Chen et al. | Aug 2002 | A1 |
20020107905 | Roe et al. | Aug 2002 | A1 |
20020107962 | Richter et al. | Aug 2002 | A1 |
20020108004 | Shah | Aug 2002 | A1 |
20020119803 | Bitterlich et al. | Aug 2002 | A1 |
20020120672 | Butt et al. | Aug 2002 | A1 |
20020120799 | Shah | Aug 2002 | A1 |
20020120805 | Hensley et al. | Aug 2002 | A1 |
20020133688 | Lee et al. | Sep 2002 | A1 |
20020138716 | Master et al. | Sep 2002 | A1 |
20020141489 | Imaizumi | Oct 2002 | A1 |
20020147845 | Sanchez-Herrero et al. | Oct 2002 | A1 |
20020159503 | Ramachandran | Oct 2002 | A1 |
20020162026 | Neuman et al. | Oct 2002 | A1 |
20020168018 | Scheuermann | Nov 2002 | A1 |
20020181559 | Heidari-Bateni et al. | Dec 2002 | A1 |
20020184275 | Dutta et al. | Dec 2002 | A1 |
20020184291 | Hogenauer | Dec 2002 | A1 |
20020184498 | Qi | Dec 2002 | A1 |
20020191790 | Anand et al. | Dec 2002 | A1 |
20030007606 | Suder et al. | Jan 2003 | A1 |
20030012270 | Zhou et al. | Jan 2003 | A1 |
20030018446 | Makowski et al. | Jan 2003 | A1 |
20030018700 | Giroti et al. | Jan 2003 | A1 |
20030023830 | Hogenauer | Jan 2003 | A1 |
20030026242 | Jokinen et al. | Feb 2003 | A1 |
20030030004 | Dixon et al. | Feb 2003 | A1 |
20030046421 | Horvitz et al. | Mar 2003 | A1 |
20030060995 | Shah | Mar 2003 | A1 |
20030061260 | Rajkumar | Mar 2003 | A1 |
20030061311 | Lo | Mar 2003 | A1 |
20030063656 | Rao et al. | Apr 2003 | A1 |
20030074473 | Pham et al. | Apr 2003 | A1 |
20030076815 | Miller et al. | Apr 2003 | A1 |
20030099223 | Chang et al. | May 2003 | A1 |
20030102889 | Master et al. | Jun 2003 | A1 |
20030105949 | Master et al. | Jun 2003 | A1 |
20030110485 | Lu et al. | Jun 2003 | A1 |
20030131162 | Secatch et al. | Jul 2003 | A1 |
20030142818 | Raghunathan et al. | Jul 2003 | A1 |
20030154357 | Master et al. | Aug 2003 | A1 |
20030163723 | Kozuch et al. | Aug 2003 | A1 |
20030172138 | McCormack et al. | Sep 2003 | A1 |
20030172139 | Srinivasan et al. | Sep 2003 | A1 |
20030200538 | Ebeling et al. | Oct 2003 | A1 |
20030212684 | Meyer et al. | Nov 2003 | A1 |
20030229864 | Watkins | Dec 2003 | A1 |
20040006584 | Vandeweerd | Jan 2004 | A1 |
20040010645 | Scheuermann et al. | Jan 2004 | A1 |
20040015970 | Scheuermann | Jan 2004 | A1 |
20040025159 | Scheuermann et al. | Feb 2004 | A1 |
20040057505 | Valio | Mar 2004 | A1 |
20040062300 | McDonough et al. | Apr 2004 | A1 |
20040081248 | Parolari | Apr 2004 | A1 |
20040093479 | Ramchandran | May 2004 | A1 |
20040133745 | Ramchandran | Jul 2004 | A1 |
20040168044 | Ramchandran | Aug 2004 | A1 |
20040225774 | Shah et al. | Nov 2004 | A1 |
20050010711 | Shah | Jan 2005 | A1 |
20050044344 | Stevens | Feb 2005 | A1 |
20050166038 | Wang et al. | Jul 2005 | A1 |
20050166073 | Lee | Jul 2005 | A1 |
20050198199 | Dowling | Sep 2005 | A1 |
20060031660 | Master et al. | Feb 2006 | A1 |
Number | Date | Country |
---|---|---|
100 18 374 | Oct 2001 | DE |
0 661 831 | Jul 1995 | EP |
0 668 659 | Aug 1995 | EP |
0 690 588 | Jan 1996 | EP |
0 691 754 | Jan 1996 | EP |
0 768 602 | Apr 1997 | EP |
0 817 003 | Jan 1998 | EP |
0 821 495 | Jan 1998 | EP |
0 923 247 | Jun 1999 | EP |
0 926 596 | Jun 1999 | EP |
1 056 217 | Nov 2000 | EP |
1 061 437 | Dec 2000 | EP |
1 061 443 | Dec 2000 | EP |
1 126 368 | Aug 2001 | EP |
1 150 506 | Oct 2001 | EP |
1 189 358 | Mar 2002 | EP |
2 067 800 | Jul 1981 | GB |
2 237 908 | May 1991 | GB |
62-249456 | Oct 1987 | JP |
63-147258 | Jun 1988 | JP |
4-51546 | Feb 1992 | JP |
7-064789 | Mar 1995 | JP |
7066718 | Mar 1995 | JP |
10233676 | Sep 1998 | JP |
10254696 | Sep 1998 | JP |
11296345 | Oct 1999 | JP |
2000315731 | Nov 2000 | JP |
2001-053703 | Feb 2001 | JP |
WO 9313603 | Jul 1993 | WO |
WO 9633558 | Oct 1996 | WO |
WO 9832071 | Jul 1998 | WO |
WO 9921094 | Apr 1999 | WO |
WO 0019311 | Apr 2000 | WO |
WO 0065855 | Nov 2000 | WO |
WO 0069073 | Nov 2000 | WO |
WO 0122235 | Mar 2001 | WO |
WO 0176129 | Oct 2001 | WO |
WO 0212978 | Feb 2002 | WO |
Entry |
---|
Abnous et al., “Ultra-Low-Power Domain-Specific Multimedia Processors,” VLSI Signal Processing, IX, 1998, IEEE Workshop in San Francisco, CA, USA, Oct. 30-Nov. 1, 1998, pp. 461-470 (Oct. 30, 1998). |
Aggarwal et al.., “Efficient Huffman Decoding,” International Conference on Image Processing IEEE 1:936-939 (Sep. 10-13, 2000). |
Allan et al., “Software Pipelining,” ACM Computing Surveys, 27(3):1-78 (Sep. 1995). |
Alsolaim et al., “Architecture and Application of a Dynamically Reconfigurable Hardware Array for Future Mobile Communication Systems,” Field Programmable Custom Computing Machines, 2000 IEEE Symposium, Napa Valley, Los Alamitos, CA. IEEE Comput. Soc. pp. 205-214 (Apr. 17-19, 2000). |
Ashenden et al., “The VHDL Cookbook,” Dept. Computer Science, University of Adelaide, South Australia. Downloaded from http://tams-www.informatik.uni-hamburg.de/vhdl/doc/cookbook/VHDL-Cookbook.pdf on Dec. 7, 2006 (Jul. 1990). |
Bacon et al., “Compiler Transformations for High-Performance Computing,” ACM Computing Surveys 26(4):368-373 (Dec. 1994). |
Balasubramonian et al., “Reducing the Complexity of the Register File in Dynamic Superscalar Processors,” Proceedings of the 34th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 237-248 (Dec. 1, 2001). |
Banerjee et al., “A MATLAB Compiler for Distributed, Heterogeneous, Reconfigurable Computing Systems,” 2000 IEEE Symposium, pp. 39-48, (Apr. 17-19, 2000). |
Bapte et al., “Uniform Execution Environment for Dynamic Reconfiguration,” Darpa Adaptive Computing Systems, http://isis.vanderbilt.edu/publications/archive/babty—T—#—0—1999—Uniform—Ex.pdf, pp. 1-7 (1999). |
Baumgarte et al., “PACT XPP—A Self-Reconfigurable Data Processing Architecture,” NN www.pactcorp.com/sneu/download/ersa01.pdf; retrieved on Nov. 25, 2005 (Jun. 25, 2001). |
Becker et al., “An Application-Tailored Dynamically Reconfigurable Hardware Architecture for Digital Baseband Processing,” IEEE Conference Proceedings Article pp. 341-346 (Sep. 18, 2000). |
Becker et al., “Design and Implementation of a Coarse-Grained Dynamically Reconfigurable Hardware Architecture,” VLSI 2001, Proceedings IEEE Computer Soc. Workshop, Piscataway, NJ, USA, pp. 41-46 (Apr. 19-20, 2001). |
Bishop & Loucks, “A Heterogeneous Environment for Hardware/Software Cosimulation,” Proceedings of the 30th Annual Simulation Symposium, pp. 14-22 (Apr. 7-9, 1997). |
Brakensiek et al., “Re-Configurable Multi-Standard Terminal for Heterogeneous Networks,” Radio and Wireless Conference, Rawcon 2002 IEEE. pp. 27-30 (2002). |
Brown et al., “Quick PDA Data Exchange,” PC Magazine pp. 1-3 (May 22, 2001). |
Buck et al., “Ptolemy: A Framework for Simulating and Prototyping Heterogeneous Systems,” International Journal of Computer Simulation 4:155-182 (Apr. 1994). |
Burns et al., “A Dynamic Reconfiguration Run-Time System,” Proceedings of the 5th Annual Symposium on Field-Programmable Custom Computing Machines, pp. 1 66-75 (Apr. 16, 1997). |
Business Wire, “Whirlpool Internet-Enabled Appliances to Use Beeline Shopper Software Features,” http://www.whirlpoocorp.com/news/releases/release.asp?rid=90 (Feb. 16, 2001). |
Buttazzo et al., “Optimal Deadline Assignment for Scheduling Soft Aperiodic Tasks in Hard Real-Time Environments,” Engineering of Complex Computer Systems, Proceedings of the Third IEEE International Conference on Como, pp. 39-48 (Sep. 8, 1997). |
Callahan et al., “Adapting Software Pipelining for Reconfigurable Computing,” in Proceedings of the International Conference on Compilers, Architectrue and Synthesis for Embedded Systems p. 8, ACM (CASES '00, San Jose, CA) (Nov. 17-18, 2000). |
Chapman & Mehrotra, “OpenMP and HPF: Integrating Two Paradigms,” Proceedings of the 4th International Euro-Par Conference (Euro-Par'98), Springer-Verlag Heidelberg, Lecture Notes in Computer Science 1470:650-658 (1998). |
Chen et al., “A Reconfigurable Multiprocessor IC for Rapid Prototyping of Algorithmic-Specific High-Speed DSP Data Paths,” IEEE Journal of Solid-State Circuits, IEEE 35:74-75 (Feb. 1, 2001). |
Clarke, “Embedded Solutions Enters Development Pact with Marconi,” EETimes Online (Jan. 26, 2000). |
Compton & Hauck, “Reconfigurable Computing: A Survey of Systems and Software,” ACM Press, ACM Computing Surveys (CSUR) 34(2):171-210 (Jun. 2002). |
Compton et al., “Configuration Relocation and Defragmentation for Run-Time Reconfigurable Computing,” Northwestern University, http://citeseer.nj.nec.com/compton00configuration.html, pp. 1-17 (2000). |
Conte et al., “Dynamic Rescheduling: A Technique for Object Code Compatibility in VLIW Architectures,” Proceedings of the 28th Annulal International Symposium on Microarchitecture pp. 208-218 (Nov. 29, 1995). |
Conte et al., “Instruction Fetch Mechanisms for VLIW Architectures with Compressed Encodings,” Proceedings of the Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) 29:201-211 (Dec. 2, 1996). |
Cray Research Inc., “Cray T3E Fortran Optimization Guide,” Ver. 004-2518-002, Section 4.5 (Jan. 1999). |
Cummings et al., “FPGA in the Software Radio,” IEEE Communications Magazine . 37(2):108-112 (Feb. 1999). |
Dandalis et al., “An Adaptive Cryptograhic Engine for IPSec Architectures,” IEEE pp. 132-141 (Jan. 2000). |
David et al., “DART: A Dynamically Reconfigurable Architecture Dealing with Future Mobile Telecommunication Constraints,” Proceedings of the International Parallel and Distributed Processing Symposium pp. 156-163 (Apr. 15, 2002). |
Deepakumara et al., “FPGA Implementation of MD5 has Algorithm,” Canadian Conference on Electrical and Computer Engineering, IEEE (2001). |
Dehon et al., “Reconfigurable Computing: What, Why and Implications for Design Automation,” Design Automation Conference Proceedings pp. 610-615 (1999). |
Dipert, “Figuring Out Reconfigurable Logic,” EDN 44(16):107-114 (Aug. 5, 1999). |
Dominikus, “A Hardware Implementation of MD4-Family Hash Algorithms,” 9th International Conference on Electronics, Circuits and Systems IEEE (2002). |
Dorband, “aCe C Language Reference Guide,” Online (Archived Mar. 2001), http://web.archive.org/web/20000616053819/http://newton.gsfc.nasa.gov/aCe/aCe—dir/aCe—cc—Ref.html (Mar. 2001). |
Drozdowski, “Scheduling Multiprocessor Tasks—An Overview,” Instytut Informatyki Politechnika, pp. 1-31 (Jan. 31, 1996). |
Ebeling et al., “RaPiD Reconfigurable Pipelined Datapath,” Springer-Verlag, 6th International Workshop on Field-Programmable Logic and Applications pp. 126-135 (1996). |
Fawer et al., “A Multiprocessor Approach for Implementing a Time-Diversity Spread Specturm Receiver,” Proceeding sof the 1990 International Zurich Seminal on Digital Communications, pp. 173-180 (Mar. 5-8, 1990). |
Fisher, “Gone Flat,” Forbes pp. 76-79 (Oct. 2001). |
Fleischmann et al., “Prototyping Networked Embedded Systems,” Integrated Engineering, pp. 116-119 (Feb. 1999). |
Forbes “Best of the Web—Computer Networking/Consumer Durables,” The Forbes Magnetic 40 p. 80 (May 2001). |
Gibson, “Fresh Technologies Will Create Myriad Functions,” FT Information Technology Review; World Wide Web at http://technews.acm.org/articles/2000-2/0301w.html?searchterm=%22fresh+technologies%22 (Mar. 1, 2000). |
Gluth, “Integrierte Signalprozessoren,” Elektronik 35(18):112-118 Franzis Verlag GMBH, Munich, Germany (Sep. 5, 1986). |
Gokhale & Schlesinger, “A Data Parallel C and Its Platforms,” Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation pp. 194-202 (Frontiers '95) (Feb. 1995). |
Grimm et al., “A System Architecture for Pervasive Computing,” Washington University, pp. 1-6 (Sep. 2000). |
Halbwachs et al., “The Synchronous Data Flow Programming Language LUSTRE,” Proceedings of the IEEE 79(9):1305-1319 (Sep. 1991). |
Hammes et al., “Cameron: High Level Language Compilation for Reconfigurable Systems,” Proc. of the Intl. Conf. on Parallel Architectures and Compilation Techniques, pp. 236-244 (Oct. 1999). |
Hartenstein, “Coarse Grain Reconfigurable Architectures,” Design Automation Conference, 2001. Proceedings of the ASP-DAC 2001, Asian and South Pacific Jan. 30, 2001-Feb. 2, 2001, Piscataway, Nj, US, IEEE, pp. 564-569 (Jan. 30, 2001). |
Heinz, “An Efficiently Compilable Extension of {M}odula-3 for Problem-Oriented Explicitly Parallel Programming,” Proceedings of the Joint Symposium on Parallel Processing (May 1993). |
Hinden et al., “The DARPA Internet: Interconnecting Heterogeneous Computer Networks with Gateways,” IEEE Computer Magazine pp. 38-48 (1983). |
Horton, “Beginning Java 2: JDK 1.3 Edition,” Wrox Press, Chapter 8, pp. 313-316 (Feb. 2001). |
Huff et al., “Lifetime-Sensitive Modulo Scheduling,” 6th Conference on Programming Language, Design and Implementation, pp. 258-267, ACM (1993). |
IBM, “Multisequencing a Single Instruction Stream Scheduling with Space-time Trade-offs,” IBM Technical Disclosure Bulletin 36(2):105-108 (Feb. 1, 1993). |
IEEE, “IEEE Standard Verilog Hardware Description Language,” downloaded from http://inst.eecs.berkeley.edu/˜cs150/fa06/Labs/verilog-ieee.pdf on Dec. 7, 2006 (Sep. 2001). |
Internet Wire, Sunbeam Joins Microsoft in University Plug and Play Forum to Establish a “Universal” Smart Appliance Technology Standard (Mar. 23, 2000). |
Ishii et al., “Parallel Variable Length Decoding with Inverse Quantization for Software MPEG-2 Decoders,” Workshop on Signal Processing Systems, Design and Implementation, IEEE pp. 500-509 (Nov. 3-5, 1997). |
Jain et al., “An Alternative Approach Towards the Design of Control Units,” Microelectronics and Reliability 24(6):1009-1012 (1984). |
Jain, “Parallel Processing with the TMS320C40 Parallel Digital Signal Processor,” Sonitech International Inc., pp. 13-46. Retrieved from: http://www-s.ti.com/sc/psheets/spra031/spra031.pdf retrieved on Apr. 14, 2004 (Feb. 1994). |
Janssen et al., “Partitioned Register File for TTAs,” Proceedings of the 28th Annual International Symposium on Microarchitecture, pp. 303-312 (Nov. 1995). |
Jong-Pyng et al. “Real-Time Virtual Channel Flow Control,” Proceedings of the Annual International Phoenix Conference on Computers and Communications, Conf. 13 pp. 97-103 (Apr. 12, 1994). |
Jung et al., “Efficient Hardware Controller Synthesis for Synchronous Dataflow Graph in System Level Design,” Proceedings of the 13th International Symposium on System Synthesis pp. 79-84 (ISSS'00) (Sep. 2000). |
Kaufmann et al., “Digital Spread-Spectrum Multipath-Diversity Receiver for Indoor Communication,” from Pioneers to the 21st Century; Denver, Proceedings of the Vehicular Technology Socity [sic] Conference, NY, IEEE, US 2(Conf. 42):1038-1041 (May 10-13, 1992). |
Kneip et al., “An Algorithm Adapted Autonomous Controlling Concept for a Parallel Single-Chip Digital Signal Processor,” Journal of VLSI Signal Processing Systems for Signal, Image, an dVideo Technology 16(1):31-40 (May 1, 1997). |
Lee & Messerschmitt, “Pipeline Interleaved Programmable DSP's: Synchronous Data Flow Programming,” IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-35(9):1334-1345 (Sep. 1987). |
Lee & Messerschmitt, “Synchronous Data Flow,” Proceedings of the IEEE 75(9):1235-1245 (Sep. 1987). |
Lee & Parks, “Dataflow Process Networks,” Proceedings of the IEEE 83(5):773-799 (May 1995). |
Liu et al., “Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment,” Journal of the Association for Computing 20(1):46-61 (1973). |
Llosa et al., “Lifetime-Sensitive Modulo Scheduling in a Production Environment,” IEEE Trans. on Comps. 50(3):234-249 (Mar. 2001). |
Lu et al., “The Morphosys Dynamically Reconfigurable System-On-Chip,” Proceedings of the First NASA/DOD Workshop on Evolvable Hardware, pp. 152-160 (Jul. 19, 1999). |
Mangione-Smith et al., “Seeking Solutions in Configurable Computing,” Computer 30(12):38-43 (Dec. 1997). |
Manion, “Network CPU Adds Spice,” Electronic Engineering Times, Issue 1126 (Aug. 14, 2000). |
Mascia & Ishii., “Neural Net Implementation on Single-Chip Digital Signal Processor,” IEEE (1989). |
McGraw, “Parallel Functional Programming in Sisal: Fictions, Facts, and Future,” Lawrence Livermore National Laboratory pp. 1-40 (Jul. 1993). |
Najjar et al., “High-Level Language Abstraction for Reconfigurable Computing,” Computer 36(8):63-69 (Aug. 2003). |
Nichols et al., “Data Management and Control-Flow Constructs in a SIMD/SPMD Parallel Language/Compiler,” Proceedings of the 3rd Symposium on the Frontiers of Massively Parallel Computation pp. 397-406 (Oct. 1990). |
OpenMP Architecture Review Board, “OpenMP C and C++ Application Program Interface,” pp. 7-16 (Oct. 1998). |
Oracle Corporation, “Oracle8i JDBC Developer's Guide and Reference,” Release 3, 8.1.7, pp. 10-8-10-10 (Jul. 2000). |
Pauer et al., “Algorithm Analysis and Mapping Environment for Adaptive Computing Systems: Further Results,” Proc. IEEE Symposium on FPGA's for Custom Computing Machines (FCCM), Napa CA (1999). |
Pauer et al., “Algorithm Analysis and Mapping Environment for Adaptive Computing Systems,” Presentation slides, Third Bi-annual Ptolemy Miniconference (1999). |
Ramamritham et al., “On Scheduling Algorithms for Real-Time Multiprocessor Systems,” Algorithms and Applications, Proceedings of the International conference on Parallel Processing 3:143-152 (Aug. 8, 1989). |
Schneider, “A Parallel/Serial Trade-Off Methodology for Look-Up Table Based Decoders,” Proceedings of the Design Automation Conference 34:498-503 (Jun. 9-13, 1997). |
Sidhu et al., “A Self-Reconfigurable Gate Array Architecture,” 10 International Workshop on Field Programmable Logic and Applications http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/17524/http:zSzzSzmaarcii.usc.eduzSzPublicationsZSzsidhu—fp100.pdf/sidhu0Oselfreconfigurable.pdf retrieved on Jun. 21, 2006 (Sep. 1, 2001). |
Smith, “Intro to ASICs: ASIC Cell Libraries,” at http://iroi.seu.edu.cn/books/asics/Book2/CH01/CH01.5.htm, printed on Feb. 4, 2005 (Jun. 1997). |
Souza, “Computing's New Face—Reconfigurable Devices Could Rattle Supply Chain,” Electronic Buyers' News Issue 1205, p. P.1 (Apr. 3, 2000). |
Souza, “Quicksilver Buys White Eagle,” Electronic Buyers' News, Issue 1220 (Jul. 17, 2000). |
Sriram et al., “MPEG-2 Video Decoding on the TMS320C6X DSP Architecture,” Conference Record of the 32nd Asilomar Conference on Signals, Systems, and Computers, IEEE pp. 1735-1739 (Nov. 1-4, 1998). |
Sun Microsystems, “FORTRAN 3.0.1 User's Guide, Revision A,” pp. 57-68 (Aug. 1994). |
Svensson, “Co's Join on Home Web Wiring Network,” Associated Press Online printed on Apr. 30, 2008 (Jun. 2000). |
Tang et al., “Thread Partitioning and Scheduling Based on Cost Model,” Proceedings of the Ninth Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 272-281 Retrieved from: http://doi.acm.org/10.1145/258492.2585 retrieved on Aug. 25, 2004 (1997). |
U.S. Appl. No. 10/719,921 Office Action Mailed Jun. 14, 2006. |
U.S. Appl. No. 11/803,998 Office Action Mailed Jul. 25, 2007. |
U.S. Appl. No. 12/115,843 Office Action Mailed Sep. 9, 2009. |
Vaya, “VITURBO: A Reconfigurable Architecture for Ubiquitous Wireless Networks,” A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree Master of Science; RICE University (Aug. 2002). |
Wang et al., “Cell Search in W-CDMA,” IEEE Journal on Selected Areas in Communications 18(8):1470-1482 (Aug. 2000). |
Whiting & Pascoe, “A History of Data-Flow Languages,” IEEE Annals of the History of Computing 16(4):38-59 (1994). |
Williamson & Lee, “Synthesis of Parallel Hardware Implementations from Synchronous Dataflow Graph Specifications,” Conference Record of the Thirtieth Asilomar Conference on Signals, Systems and Computers 1340-1343 (Nov. 1996). |
Wirthlin et al., “A Dynamic Instruction Set Computer,” Proceedings of the IEEE Symposium on FPGA's for Custom Computing Machines, pp. 99-107 (Apr. 21, 1995). |
www.appliancemagazine.com, World Wide Web at http://web.archive.org/web/20000511085402/http://www.appliancemagazine.com/ printed on Apr. 30, 2008. |
Number | Date | Country | |
---|---|---|---|
20130013872 A1 | Jan 2013 | US |
Number | Date | Country | |
---|---|---|---|
60428646 | Nov 2002 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13178125 | Jul 2011 | US |
Child | 13609949 | | US |
Parent | 12251871 | Oct 2008 | US |
Child | 13178125 | | US |
Parent | 12115843 | May 2008 | US |
Child | 12251871 | | US |
Parent | 11803998 | May 2007 | US |
Child | 12115843 | | US |
Parent | 10719921 | Nov 2003 | US |
Child | 11803998 | | US |