This invention broadly relates to a computer architecture particularly adapted for high bandwidth, high concurrency and multitasking operations. In a conventional computing system, the central processing unit (CPU), main memory and input/output (I/O) devices are connected by a bus. A “bus master” or “bus arbiter” controls and directs data traffic among the components of the computing system. Main memory is used as the principal site for storing data. An “access” to main memory writes data to or reads data from main memory. Making an access (or “accessing”) is typically preceded by a request for access from another component of the system, such as the CPU or an I/O device, followed by a grant of permission by the bus arbiter.
There are two principal types of accesses. The first type is a data access, in which large amounts of data are written to or read from main memory. A data access may be on the order of thousands of bytes. The second type is a control/status access, characterized by a small number of reads or writes to a defined data structure in order to report the status of an input/output device, process data, or initiate some input/output activity. In contrast to data accesses, a control/status access is usually on the order of a few bits. Control accesses are generally initiated by the CPU, while status accesses are generally initiated by the I/O devices.
Referring to
In systems having more than one system bus, for example, as shown in
A dual port memory design allows access to a single memory from two buses. A typical dual port memory design is shown in U.S. Pat. No. 4,796,232. The '232 design provides access to a multiple bank, DRAM memory through two ports. A logic circuit arbitrates between read/write requests from the ports and DRAM refresh requests. The logic circuit allows one memory bank to be refreshed while another bank is accessed by a read or write to a port. The '232 design also uses a data register between each bus and the memory banks. A data register will accept, for example, a data element written from a bus, thereby freeing that bus for other activity. However, subsequent data elements cannot be written from that bus until the data element in the register is written into memory. The transfer of the data element from the register into memory may involve some delays because it must compete with transfer requests from the other bus and with refresh requests.
U.S. Pat. No. 4,656,614 to Suzuki discloses an apparatus usable for multiple simultaneous accesses to a memory. Suzuki describes an individual memory block made up of an array of memory bit cells. Suzuki describes a method for concurrently accessing two memory bit cells within the same memory block; this is well known today as a dual-port or a multi-port memory, as shown in
Another example of a system having more than one system bus is shown in
According to one aspect of the present invention, there is provided an apparatus comprising a memory, a plurality of devices and an interface for controlling access to the memory by each device, wherein the interface is arranged to control memory access so that the plurality of devices can access different parts of the memory substantially simultaneously.
In one embodiment, one or more devices comprise a bus, for example, a data bus or system bus. In this arrangement, a single interface is used to control memory accesses to different parts or elements of a memory substantially simultaneously so that a plurality of, or multiple memory accesses can be performed at the same time. Advantageously, providing a single interface to control memory accesses allows the circuitry required to implement this functionality to be significantly reduced in comparison to the example provided above, in which each memory element has its own system bus interface. The use of a single memory interface to control access to a plurality of memory elements by different data buses significantly reduces the capacitive loading on the data buses, allowing the buses to run at higher speeds. Furthermore, the interface is arranged to permit different data buses (or other devices) to access different parts of the memory or different memory elements at the same time, or in parallel. This significantly improves the efficiency of the system and increases the bandwidth of the memory in comparison to the above examples in which each memory interface allows only one system bus to access the memory at any one time. In addition, this arrangement allows the use of single port memories which are much smaller than dual port memories, and allows a plurality of single port memories to be accessed at the same time.
In some embodiments, the different parts of the memory or memory elements are arranged side by side in a row and/or in a column, and may be arranged in a 1-dimensional, 2-dimensional, or 3-dimensional array. Each memory part or element may comprise a discrete memory.
Each memory part or memory element may be a single ported-type memory, e.g. having a single row or column of I/Os, or may comprise a dual ported-type memory having two rows or columns of I/Os. Each memory part or memory element may comprise a contiguous array of data storage elements.
Embodiments of the invention may comprise three or more memory elements, a plurality of which can be selectively accessed independently at the same time. Thus, unlike a dual ported memory, which only allows two accesses at the same time, the present arrangement allows the memory to be more flexibly configured so that any number of buses or other devices can access the memory at the same time.
In some embodiments, the memory may be controlled by control signals which control all parts of the memory in the same way at the same time. In this case, the interface may permit different system buses to access different columns of memory at the same time. Data output from different columns or input to different columns may each have different row addresses, or the row addresses may be the same.
In other embodiments, operation of different parts or elements of the memory may be controlled independently of one another, so that, for example, one part of the memory can be placed in a data write mode and at the same time, another part of the memory can be placed in a data read mode. In this case, the different parts may comprise different memory elements.
In some embodiments, the memory comprises a plurality of memory elements, each memory element having a separate control for a read access and a write access, and the interface is adapted to enable a write access to at least one memory element by a data bus and a read access to at least one other memory element by another data bus substantially simultaneously or in parallel.
In some embodiments, the interface is responsive to requests by each data bus for a read access, to connect each data bus to a different memory element substantially simultaneously for a read access.
A respective read data bus may be connected between each memory element and the interface for carrying data from a respective memory element to the interface.
In some embodiments, a read data bus may be connected to a plurality of memory elements and to the interface for carrying data from the memory elements to the interface, and connected such that the read data bus is shared between the memory elements.
In some embodiments, the interface is responsive to requests by each data bus for a write access, to connect each data bus to a different memory element substantially simultaneously for a write access.
In some embodiments, a respective write data bus may be connected between each memory element and the interface for carrying data from the interface to a respective memory element.
In some embodiments, a write data bus may be connected to a plurality of memory elements and to the interface for carrying data from the interface to the memory elements, and connected such that the write data bus is shared between the memory elements.
In some embodiments, the interface is responsive to requests by each data bus for a read access, to connect each data bus to a different part or element of the memory substantially simultaneously for a read access.
In some embodiments, the memory has a single control for enabling read accesses thereto. For example, the interface may generate a common read control which is used to control a plurality of different memory elements.
In some embodiments, the memory comprises a plurality of columns of memory elements, each column having its own address bus and its own (internal or local) data bus, and the interface may be adapted to connect different columns of memory elements to different data buses substantially simultaneously or in parallel.
In some embodiments, the memory comprises a plurality of rows of memory elements, each row having its own address bus and its own (internal or local) data bus, and the interface may be adapted to connect different rows of memory elements to different data buses substantially simultaneously or in parallel.
The interface may be responsive to requests by each data bus for a write access, to connect each data bus to a different part or element of the memory substantially simultaneously for a write access.
In some embodiments, the memory has a single control for enabling write accesses thereto. For example, the interface may generate a common write control which is used to control a plurality of different memory elements.
In some embodiments, the memory comprises a plurality of columns of memory elements, each column having its own address bus and (internal or local) data bus, and the interface may be adapted to connect different columns of memory elements to different data buses substantially simultaneously or in parallel, to permit parallel write accesses.
In some embodiments, the memory comprises a plurality of rows of memory elements, each row having its own address bus and its own (internal or local) data bus, and the interface may be adapted to connect different rows of memory storage elements to different data buses substantially simultaneously.
In some embodiments, a plurality of processor elements may be coupled to the memory. In some embodiments, the memory may comprise a plurality of memory elements and one or more processor elements may be coupled to each memory element.
In some embodiments, the apparatus comprises a controller for controlling operations of the processor elements. For example, the controller may be adapted to control operations of each processor element substantially simultaneously. In some embodiments, the controller may be adapted to control each processor element to perform the same function substantially simultaneously.
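By way of illustration only, the lock-step control described above, in which a controller causes every processor element to perform the same function at the same time on its own memory element, might be sketched as follows. The function name `simd_step` and the representation of each memory element as a Python list are assumptions made for this sketch and are not taken from the embodiments.

```python
def simd_step(operation, local_memories, address, operand):
    """Apply the same operation in every processor element to its own local memory.

    `local_memories` holds one list per memory element (hypothetical model);
    every element is updated at the same address in the same step.
    """
    for memory in local_memories:
        memory[address] = operation(memory[address], operand)


# Two processor elements, each with a two-word local memory.
memories = [[1, 2], [10, 20]]
simd_step(lambda value, operand: value + operand, memories, address=1, operand=5)
```

The loop stands in for what hardware would do in parallel; only the "same function, different data" relationship is being shown.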
According to another aspect of the present invention, there is provided an apparatus comprising a plurality of memories, an interface, a plurality of devices coupled to said interface, and wherein said interface is adapted to control access to said memories so that said plurality of devices can access different memories substantially simultaneously.
According to another aspect of the present invention, there is provided an apparatus comprising a plurality of memories, an interface, a plurality of devices coupled to said interface, and a plurality of address buses, wherein one address bus is couplable to at least one of said memories and another address bus is couplable to at least one other of said memories.
According to another aspect of the present invention, there is provided an interface for controlling access to a plurality of memories by one or more devices comprising means for receiving a memory access request from each device, means for detecting the identity of the memory to be accessed, and if two requests request access to different memories, the memory interface is adapted to permit access to said different memories, for example at different times or substantially simultaneously.
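By way of illustration only, the behaviour of such an interface (receive a request from each device, detect which memory is targeted, and permit requests to different memories to proceed together) might be sketched as follows. The names `Request` and `arbitrate`, and the rule that a conflicting request simply waits in list order, are assumptions made for this sketch; the specification does not prescribe a priority scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    device: str      # hypothetical device label, e.g. "bus0"
    memory_id: int   # identity of the memory the device wants to access


def arbitrate(requests):
    """Split requests into those granted this cycle and those deferred.

    Requests targeting different memories are granted together. A request for
    a memory already claimed this cycle is deferred (assumed priority: list order).
    """
    granted, deferred, busy = [], [], set()
    for request in requests:
        if request.memory_id in busy:
            deferred.append(request)
        else:
            busy.add(request.memory_id)
            granted.append(request)
    return granted, deferred


granted, deferred = arbitrate(
    [Request("bus0", 0), Request("bus1", 1), Request("bus2", 0)]
)
```

In this usage, `bus0` and `bus1` target different memories and are granted together, while `bus2` is deferred because memory 0 is already claimed.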
Further objectives and advantages of the present invention will become apparent from a careful reading of a detailed description provided hereinbelow, with appropriate reference to accompanying drawings.
Examples of embodiments of the present invention will now be described with reference to the drawings, in which:
It should be understood that the drawings are not necessarily to scale and that the embodiments are sometimes illustrated by graphic symbols, phantom lines, diagrammatic representations and fragmentary views. In certain instances, details which are not necessary for an understanding of the present invention or which render other details difficult to perceive may have been omitted. It should be understood, of course, that the invention is not necessarily limited to the particular embodiments illustrated herein. Like numbers utilized throughout the various Figures designate like or similar parts.
In embodiments of the present invention, the apparatus comprises an interface which is able to service multiple buses at the same time as long as the buses do not operate on the memory in a manner that would be contrary to allowed memory operations. There are numerous ways in which the memory can be implemented to enable the interface to allow a plurality of data buses to operate thereon simultaneously, and non-limiting examples of various implementations are as follows.
(1) The memory may be implemented so that different parts of the memory are capable of operating in different modes at the same time. For example, the interface may be adapted to control one part of the memory for a read operation and another part of the memory for a write operation at the same time. Each part of the memory has an input and output data path and the input and output data paths (buses) may be shared between different parts of the memory or each part of the memory may have a separate input and output data path (bus). However, in this implementation in which the interface permits one read and one write access at the same time, a shared input path and a shared output path is sufficient. In this case, only two data buses are required (one read and one write) and therefore the routing is efficient. In some embodiments, in order to be able to perform two memory operations at the same time, so that one memory element performs one operation and another memory element performs another operation, two address buses may be provided from the memory interface to each memory element. A selector e.g. a 2:1 mux may be provided at the memory address input so that the appropriate address bus can be selectively connected thereto. This allows any two memory elements to be addressed at the same time so that both can perform a read, a write, or one can perform a read and another a write. The selectors may be controlled by the interface, and the WE signal may be used for this purpose.
(2) In another implementation, the interface may be adapted to permit a plurality of read accesses at the same time or a plurality of write accesses at the same time. For example, the memory may comprise a plurality of memory elements each controllable to be placed in read mode or write mode and each memory element may have its own input data bus and/or its own output data bus. In this case, the memory interface may be adapted to permit each system data bus to access an arbitrary address within each memory element in parallel. However, since each memory element has at least one dedicated data bus, the amount of required routing becomes large in large arrays.
(3) In another implementation, in which the memory elements are arranged in an array comprising a plurality of columns of memory elements, the memory interface may be adapted to allow concurrent access to different columns of memory element(s) at the same time. In this case, each column of memory elements would have its own data and address buses so that two or more different columns can be accessed at the same time.
(4) In another implementation, in which the memory elements are arranged in an array comprising a plurality of rows of memory elements, the memory interface may be adapted to permit concurrent access to different rows of memory elements at the same time. In this case, each row of memory elements would have its own data and address buses so that two or more different rows can be accessed at the same time.
(5) In another implementation, the memory interface may be adapted to allow concurrent access to different sub-arrays of memory elements at the same time. For example, in a two-dimensional array of memories, for example a 16×16 array of memories, the interface may be adapted to permit concurrent access to different sub-arrays within the array, for example different two-dimensional sub-arrays. The sub-arrays may be of any size, e.g. 2×2, 4×4, 2×4, 4×2, 8×4, 8×8, etc.
The memory elements within each sub-array may share at least one of the same data bus for write access, the same data bus for read access and the same address bus. The memory elements within the same sub-array may be controlled by at least one common control signal, for example a common memory enable signal which enables or disables all memory elements in the array, a write enable signal, which places all memory elements in the sub-array into write mode, a read enable signal which places all memory elements within the sub-array in read mode, and common high and low byte write enable signals. A sub-array of memory elements may either be completely independently controllable from other sub-arrays or different sub-arrays may share one or more data buses and/or one or more address buses and/or one or more control signals with one or more other sub-arrays.
For example, for completely independently controllable sub-arrays, each sub-array can be enabled or disabled independently of the others and can be independently write enabled or read enabled for access to any address within the array.
In a partially independently controllable sub-array, the sub-array may share the same read and/or write data bus with one or more other sub-arrays, in which case only one sub-array can read or write to the shared data bus at any one time. However, different sub-arrays having shared read and/or write data buses could be controlled by the memory interface so that data can be read from one sub-array and data written to the other, at the same time.
(6) In another implementation, the memory interface may be adapted to allow concurrent access to different sub arrays within a three-dimensional array of memory.
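By way of illustration only, the sub-array partitioning of implementation (5) might be sketched as follows for a two-dimensional array of memory elements. The helper names and the simplifying rule that two elements can be accessed concurrently whenever they lie in different, fully independent sub-arrays are assumptions made for this sketch; partially independent sub-arrays with shared buses would need additional checks.

```python
def sub_array_of(row, col, sub_rows, sub_cols):
    """Return the (block_row, block_col) index of the sub-array holding element (row, col)."""
    return (row // sub_rows, col // sub_cols)


def can_access_concurrently(a, b, sub_rows, sub_cols):
    """True when elements a and b, given as (row, col), fall in different sub-arrays.

    Assumed rule: sub-arrays are fully independent, each with its own data
    buses, address bus and control signals.
    """
    return sub_array_of(*a, sub_rows, sub_cols) != sub_array_of(*b, sub_rows, sub_cols)
```

For example, with 4×4 sub-arrays in a 16×16 array, elements (0, 0) and (0, 3) share a sub-array, whereas (0, 0) and (0, 4) do not.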
Referring to
In one embodiment, the system bus is used to carry both the memory address data and the data read from or written to memory (i.e. information data). In other words, the same one bit lines of the system bus are used to carry both address and information data and these are transmitted in different cycles or time frames in any order. Thus, for example, in a first cycle, the address and control data are sent to the memory interface and in a following cycle or cycles, the information data is sent. In another embodiment, the system bus may comprise separate dedicated control and data buses so that the information data and address data can be sent in parallel.
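By way of illustration only, the two bus arrangements described above might be compared with the following sketch, which lists what each bus cycle carries for a write of several words. The function names, the tuple layout and the assumption that a split-bus burst uses consecutive addresses are hypothetical and introduced only for this example.

```python
def multiplexed_write(address, words):
    """Cycle-by-cycle contents of a shared address/data bus for one write.

    The first cycle carries the address (and control); the following
    cycles carry the information data on the same lines.
    """
    return [("ADDR", address)] + [("DATA", word) for word in words]


def split_bus_write(address, words):
    """With separate address and data buses, each cycle carries an address and a word.

    Consecutive addressing is an assumption of this sketch.
    """
    return [(address + offset, word) for offset, word in enumerate(words)]
```

Under these assumptions, a write of n words occupies n + 1 cycles on the multiplexed bus and n cycles on the split buses.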
In this embodiment, the apparatus comprises a first input data bus 116 connected to the input ports 117, 119 of the first and second memory elements 103, 105 and which is connected to the memory interface 115, and a first output data bus 118 connected to the output ports 121, 123 of the first and second memory elements 103, 105 and to the memory interface 115. Thus, the first input data bus 116 is shared between the first and second memory elements 103, 105 for transferring data from the memory interface 115 to the first and second memory elements 103, 105. Similarly, the first output data bus 118 is shared between the first and second memory elements 103, 105 to transfer data from the first and second memory elements 103, 105 to the memory interface 115.
The apparatus further comprises a second input data bus 120 connected to the data inputs 125, 127 of the third and fourth memory elements 107, 109 and to the memory interface 115, and a second output data bus 122 connected to the data outputs 129, 131 of the third and fourth memory elements 107, 109 and to the memory interface 115. Therefore, in this embodiment, the second data input bus 120 is shared between the third and fourth memory elements 107, 109 to transfer data from the memory interface 115 to the third and fourth memory elements 107, 109. Similarly, the second output data bus 122 is shared between the third and fourth memory elements 107, 109 to transfer data from the memory elements 107, 109 to the memory interface 115.
The memory interface 115 includes a controller for generating control signals for controlling operations of the memory elements and a control bus 135 is connected between the memory interface 115 and each memory element 103, 105, 107, 109 for carrying the control signals. These signals may include a memory enable (ME) signal which controls turning on and off the memory, a write enable (WE) signal which controls the mode of operation of the memory between write mode and read mode, and optionally a byte write enable (BWE) signal which enables a subset of input/output ports of the memory to be selected, so that, for example, data words of variable length can be written into and output from the memory.
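By way of illustration only, the effect of the ME, WE and BWE signals on a single memory element might be modelled as follows. The class name `MemoryElement`, the 16-bit word width and the two byte lanes (low and high) are assumptions made for this sketch.

```python
class MemoryElement:
    """Toy single-port memory with 16-bit words split into two byte lanes."""

    def __init__(self, depth):
        self.cells = [0] * depth

    def cycle(self, me, we, addr, data_in=0, bwe=0b11):
        """Perform one access; return read data, or None if disabled or writing."""
        if not me:
            # ME=0: the memory is turned off and nothing happens.
            return None
        if not we:
            # WE=0: read mode.
            return self.cells[addr]
        # WE=1: write mode. BWE bit 0 enables the low byte, bit 1 the high byte.
        mask = (0x00FF if bwe & 0b01 else 0) | (0xFF00 if bwe & 0b10 else 0)
        self.cells[addr] = (self.cells[addr] & ~mask & 0xFFFF) | (data_in & mask)
        return None


element = MemoryElement(depth=4)
element.cycle(me=1, we=1, addr=2, data_in=0xABCD)            # full-word write
element.cycle(me=1, we=1, addr=2, data_in=0x1234, bwe=0b01)  # low byte only
low_byte_result = element.cycle(me=1, we=0, addr=2)
```

After the second write only the low byte has changed, so the word read back combines the original high byte with the new low byte.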
First and second sets of address buses 137, 138, 139, 140 are connected between the memory interface 115 and the first and second rows of memory elements, respectively, to control the row selector and possibly a column selector of each memory element. In this embodiment, each set comprises two address buses. A selector 142 (e.g. a 2:1 mux) is provided at the address input of each memory 103, 105, 107, 109 to selectively connect the appropriate address bus thereto. This allows any two memories to be accessed by one or more devices at the same time, for example for simultaneous writes, simultaneous reads or a write and a simultaneous read. It will be appreciated that any number of address buses selectively connectable to each memory element may be provided depending on how many devices are to be permitted to access the memory at the same time, or how many memory accesses are to be permitted at the same time, and the number of address buses may correspond to the number of such devices, or accesses, for example.
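By way of illustration only, the address selection described above might be sketched as follows, with one 2:1 selector per memory element choosing between two address buses. The function names and the dictionary mapping element reference numerals to select bits are assumptions made for this sketch.

```python
def select_address(address_buses, select):
    """2:1 mux at a memory element's address input: pick one of two address buses."""
    return address_buses[select]


def route_addresses(first_bus_addr, second_bus_addr, selects):
    """Return the address presented to each memory element.

    `selects` maps an element number to 0 (first address bus) or 1 (second).
    """
    buses = (first_bus_addr, second_bus_addr)
    return {element: select_address(buses, sel) for element, sel in selects.items()}


routed = route_addresses(0x10, 0x20, {103: 0, 105: 1})
```

Here elements 103 and 105 receive different addresses in the same cycle, which is what allows two accesses to proceed at once.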
The interface may be adapted to generate an individual set of control signals for each memory element so that each memory element is independently controllable. For example, the memory interface may be adapted to generate separate memory enable signals for each memory element so that individual elements can be turned off when not in use to save power, for example. In another implementation, the memory interface may be adapted to generate one or more control signal(s) that are shared between a plurality of memory elements. For example, with reference to
Non-limiting examples of various operations of the memory system and memory interface are described below.
(1) Where the memory interface generates control signals which are common to all memory elements, the memory interface can place all memory elements simultaneously either in read mode or in write mode. In this case, the system buses 111, 113 can either both read data from the memory 102 or both write data to the memory. For example, the memory interface may permit the first system bus 111 to perform a memory read from the first memory group comprising memory elements 103, 105 or from the second memory group comprising memory elements 107, 109 and may simultaneously permit the second system bus 113 to perform a memory read from the other of the two memory groups, i.e. the group not being operated on by the first system bus 111.
Similarly, the memory interface may permit the first and second system buses to perform simultaneous memory write operations where one of the system buses 111, 113 performs a write operation on a memory element of one of the groups and the second system bus performs a write operation on a memory element of the other memory group.
(2) Where the memory interface is capable of generating separate control signals for each memory group so that each memory group can be controlled independently of the other, in addition to the above modes of operation, the memory interface can permit one system (or external) bus to perform a write operation on a memory element of one of the memory groups and at the same time permit another system (or external) bus to perform a memory read operation on a memory element of another memory group.
(3) Where the memory interface is adapted to generate control signals to independently control memory elements of the same group, the memory interface may permit, for example, one of the system buses 111, 113 to perform a write operation on one memory element of a predetermined memory group and at the same time permit another system bus to perform a read operation on another memory element of the same predetermined memory group. Thus, in one specific example, the memory interface 115 may be adapted to permit the first system bus 111 to perform a read access to the first memory element 103 and at the same time permit the second system bus 113 to perform a memory write operation to the second memory element 105, the first and second memory elements belonging to the same memory group.
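By way of illustration only, the three cases above might be condensed into a single check of whether two accesses can proceed at the same time. The function name `allowed`, the `control` labels and the encoding of the memory groups are assumptions made for this sketch; it also assumes, as in the described embodiment, that each group shares one input data bus and one output data bus.

```python
GROUP = {103: 0, 105: 0, 107: 1, 109: 1}  # element number -> memory group


def allowed(a, b, control):
    """Decide whether accesses a and b may proceed simultaneously.

    a and b are (element, op) pairs with op in {"read", "write"}.
    control is "common" (case 1), "group" (case 2) or "element" (case 3).
    """
    (elem_a, op_a), (elem_b, op_b) = a, b
    if elem_a == elem_b:
        return False  # a single-port element serves one access at a time
    same_group = GROUP[elem_a] == GROUP[elem_b]
    same_op = op_a == op_b
    if same_group:
        # Within a group the input bus and the output bus are each shared, so
        # only a read paired with a write fits, and only with per-element control.
        return (not same_op) and control == "element"
    # Different groups: two reads or two writes always fit; a read paired with
    # a write needs at least per-group control.
    return same_op or control in ("group", "element")
```

For example, `allowed((103, "read"), (105, "write"), "element")` corresponds to the specific example given in case (3).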
Thus, the embodiment of
Use Cases for Two System Buses
In a variation of the embodiment of
Use Cases for Three System Buses
In this implementation, since all memory elements share the same data input bus and all share the same data output bus, for simultaneous operations, the memory interface is limited to permitting a read operation with a simultaneous write operation. For example, the memory interface may allow one of the system buses 111, 113 to perform a read operation on any one of the memory elements and at the same time permit the other system bus to perform a write operation on any other of the memory elements. In this case, the memory interface is adapted to generate control signals for independently controlling each memory element.
In a more limited implementation, the memory interface may be adapted to generate a common set of control signals for controlling operation of the memory elements of one memory group (e.g. memory elements 103, 105) and a second set of common control signals for controlling the memory elements of another memory group (e.g. the third and fourth memory elements 107, 109). In this case, the memory elements of the same group are all controlled in the same way so that all memory elements are either in read mode or write mode. In one example of this more limited implementation, the memory interface 115 is adapted to generate a common set of control signals for memory elements 103 and 105 and a second set of common control signals for memory elements 107, 109. In this case, the memory interface is limited to permitting a read access from either of memory elements 103, 105 and a simultaneous write access to memory elements 107, 109, or a write access to memory elements 103, 105 and a simultaneous read access from memory elements 107, 109.
In other embodiments, the memory interface may be adapted to generate control signals for independently controlling any one memory element or any group of memory elements comprising any number of memory elements, as desired or required.
Embodiments of the present invention may be incorporated into a data processor apparatus, in which one or more processor units are coupled to each memory element of the memory. The processor units may be controlled by an array controller. The array controller may be adapted to control the processor units to perform operations in parallel to implement a SIMD (single instruction, multiple data) processor. An example of such a system is shown in
Referring to
An example of an implementation of the data processor of
Referring to
The data processor 100 includes an array controller 157 for controlling operations of the processor units 141 to 155, and a control bus 159 for carrying control signals from the array controller to each processor unit. The array controller 157 is also connected to the memory interface 115 and one or more buses may be provided between the array controller and memory interface to carry signals therebetween. In this example, a memory request bus 161 is provided for carrying memory request signals to the memory interface for requesting memory accesses by the processor units. An optional data bus 163 is also provided between the array controller and memory interface for carrying broadcast data from the array controller to the memory 102. Advantageously, the data bus 163 can be used to broadcast data to one or more processor units which removes the need for a dedicated broadcast bus between the array controller and each PU thereby saving routing and chip area, as described in the applicant's co-pending application (attorney docket number 79135-24) identified above.
An example of a memory interface is shown in more detail in
The memory interface arbitrates access to the memory between all of the system buses attached to it. If the processor units are accessing the memory, generally, no other accesses are permitted because the processor units typically use all of the memory elements at the same time. If more devices try to access the memory than are allowed, the memory interface passes the data/address information of the active bus or buses through to the memory, while the inactive bus or buses wait for their turn. Control signals such as ME, WE and BWE are generated based on the type of access and the input address to the arbiter. For example, if the operation is from the array controller and is a memory store intended for the processor units, i.e. the processor units are controlled to write their data to the memory elements, all of the ME/WE signals will be set to 1. BWE signals may be generated by both the memory interface and by the individual processing elements and these signals may be logically combined (e.g. OR'd together) so that individual processing elements can independently control write operations to their local memory. Similarly, if the operation from the array controller is a memory read, the memory interface will generate ME=1 and WE=0.
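By way of illustration only, the generation of control signals for an array controller operation might be sketched as follows: a store sets ME=1 and WE=1 for every memory element, a read sets ME=1 and WE=0, and the BWE value from the memory interface is OR'd with the BWE value from each processing element. The function name, the dictionary layout and the two-bit BWE encoding are assumptions made for this sketch.

```python
def array_controller_signals(op, interface_bwe, pe_bwe):
    """Return one set of control signals per memory element.

    op is "store" or "read"; pe_bwe holds one BWE value per processing element.
    """
    we = 1 if op == "store" else 0
    return [
        {"ME": 1, "WE": we, "BWE": interface_bwe | element_bwe}
        for element_bwe in pe_bwe
    ]


store_signals = array_controller_signals("store", interface_bwe=0b00, pe_bwe=[0b01, 0b11])
read_signals = array_controller_signals("read", interface_bwe=0b11, pe_bwe=[0b00, 0b00])
```

In the store example, the first processing element enables only its low byte lane while the second enables both, so each local memory is written independently.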
If the operation is from an external bus, then the control signals (e.g. ME/WE/BWE) will be generated based on the address and type of access, e.g. read/write. In the example of
Assuming that the embodiment of
The apparatus may comprise any number of system buses and the memory interface may be adapted to control the access of any number of buses to the memory.
The memory may comprise any number of memory elements and the memory elements may be arranged in any manner, for example, as a one-dimensional array, a two-dimensional array or a three-dimensional array.
Each memory element may have any bit width and number of I/Os, for example 2, 4, 8, 16, 32, 64, 128, 256, 1024, or larger.
Where the interface is adapted to permit simultaneous access to the same memory element by different external data buses, different data buses may be permitted access to different memory I/Os in the same row (e.g. all positioned along either the upper edge of the memory, or all positioned along the lower edge of the memory); or in the same column (e.g. all positioned along one side of the memory).
Any aspect, embodiment or feature disclosed or claimed herein may be combined with any aspect, embodiment or feature disclosed in the applicant's co-pending application filed on 29th Apr., 2005, entitled “Data Processor Apparatus and Memory Interface”, attorney docket number 79135-24, the entire contents of which is incorporated herein by reference.
Other aspects or embodiments of the invention comprise any one or more feature disclosed herein in combination with any one or more other feature disclosed herein.
Numerous modifications and changes to the embodiments described above will be apparent to those skilled in the art.
Thus, there have been shown and described several embodiments of a novel invention. As is evident from the foregoing description, certain aspects of the present invention are not limited by the particular details of the examples illustrated herein, and it is therefore contemplated that other modifications and applications, or equivalents thereof, will occur to those skilled in the art. The terms “having” and “including” and similar terms as used in the foregoing specification are used in the sense of “optional” or “may include” and not as “required”. Many changes, modifications, variations and other uses and applications of the present construction will, however, become apparent to those skilled in the art after considering the specification and the accompanying drawings. All such changes, modifications, variations and other uses and applications which do not depart from the spirit and scope of the invention are deemed to be covered by the invention which is limited only by the claims which follow.
This application claims the benefit of U.S. provisional application Ser. No. 60/675,899, filed Apr. 29, 2005, the disclosure of which is incorporated herein by reference.
Number | Date | Country
---|---|---
60675899 | Apr 2005 | US