Priority Claim. This Application describes technologies that can be used with inventions, and other technologies, described in one or more of the following documents. This application claims priority, to the fullest extent permitted by law, of these documents.
This application claims priority of the following documents, and all documents which those documents incorporate by reference.
These documents are hereby incorporated by reference as if fully set forth herein. Techniques described in this Application can be elaborated with detail found therein. These documents are sometimes referred to herein as the “Incorporated Disclosures,” the “Incorporated Documents,” or variants thereof.
A portion of the disclosure of this patent document contains material subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
This background is provided as a convenience to the reader and does not admit to any prior art or restrict the scope of the disclosure or the invention. This background is intended as an introduction to the general nature of technology to which the disclosure or the invention can be applied.
Communication between computing devices and memories, such as between ASICs and dynamic RAM (DRAM), generally involves at least one communication link for the computing device to direct which memory cell (or group thereof) the memory device should access, and at least one communication link for the computing device to direct what data the memory device should read from or write to that memory cell (or group thereof). For example, the computing device can be coupled to the memory device using (first) a command/address bus, which the computing device can use to direct the memory device whether to read or write data (and with respect to which address location to read or write data), and (second) a data bus, which the computing device can use to specify the actual data the memory device should read or write.
One problem that has arisen in the art is that there is often a substantial delay between when the computing device requests data from the memory and when the memory responds with that data. This prompts the computing device to reserve a delayed time slot, on a communication link between the computing device and the memory, to receive data from the memory. In contrast, when the computing device wishes to write data to the memory, the computing device is generally capable of communicating that data to be written right away.
However, when the time slot to write data has already been reserved to read data, the computing device must often wait to communicate the data it wishes to write to the memory. If any substantial number of write operations are delayed, the computing device might have to buffer those write operations, including their write locations (thus, memory addresses) and their data to write, while read operations between the computing device and the memory proceed. The communication link between the computing device and the memory might also be inefficiently used if read operations prevent write operations from being timely placed on the communication link.
This conflict between read operations and write operations can also involve relatively complex circuitry on the computing device to manage the ordering and priority of read operations and write operations. Moreover, when the computing device uses more than one bank of memory, such as when the computing device is coupled to more than one memory device, the circuitry on the computing device can also be involved with managing the selection and ordering of read operations and write operations with respect to multiple memory devices, such as to optimize the bandwidth and latency with respect to those read operations and write operations. Designing and placing circuitry on the computing device can be expensive.
Conflict between read operations and write operations can also involve undesirable use of buffer space and circuitry on the computing device. For example, the computing device might have to maintain a buffer of read operations or write operations. Any such buffer can become longer than desired and can take up a substantial amount of complexity and space on, or power requirements of, the computing device. More specifically, requirements for logic for buffering read operations or write operations can be complex and can have both space and power requirements that designers of the computing device might prefer to use for other functions.
Each of these issues, as well as other possible considerations, might relate to aspects of interactions between computing devices and memory devices, including the ordering and priority of read operations and write operations, the allocation of space and power for circuitry and logic to manage those operations, and related matters.
This summary of the disclosure is provided as a convenience to the reader and does not limit or restrict the scope of the disclosure or the invention. This summary is intended as an introduction to more detailed description found in this Application, and as an overview of techniques explained in this Application. The described techniques have applicability in other fields and beyond the embodiments specifically reviewed in detail.
Among other disclosures, this Application describes a system, and techniques for use, capable of performing both read operations and write operations concurrently with one or more memory devices. In one embodiment, one or more queues and/or buffers for read operations and/or write operations are maintained on the memory device, allowing the computing device to issue concurrent read commands and write commands, without requiring complex ordering or buffering circuitry on the computing device, or at least allowing the computing device to use only relatively simplified ordering or buffering.
In the figures, like references generally indicate similar elements, although this is not strictly required.
After reading this Application, those skilled in the art would recognize that the figures are not necessarily drawn to scale for construction, nor do they necessarily specify any particular location or order of construction.
General Discussion
In one embodiment, a computing device can perform both read operations and write operations concurrently with one or more memory devices.
In one embodiment, one or more queues and/or buffers for read operations and/or write operations are maintained on the memory device, allowing the computing device to issue concurrent read commands and write commands, without requiring complex ordering or buffering circuitry on the computing device, or at least allowing the computing device to use only relatively simplified ordering or buffering.
In one embodiment, when a computing device issues a command for a read operation, the memory device receives the read command and buffers that command in a priority order assigned to that command, such as for relatively immediate processing, or otherwise as specified by a system designer or by the computing device. When the memory device performs the read command, it can respond with the read data on a communication link (or “bus”) directed from the memory device to the computing device, such as a data bus disposed to communicate read data.
In one embodiment, when the computing device issues a command for a write operation, the memory device receives the write command and buffers that command for processing in a priority order for that command, such as for relatively immediate processing, or otherwise as specified by a system designer or by the computing device. The memory device can receive data associated with the write command on a separate communication link (or “bus”) directed from the computing device to the memory device. When the memory device performs the write command, it can respond with an acknowledgement on the communication link directed from the memory device to the computing device, or on a command/acknowledgement bus disposed to communicate acknowledgements.
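For illustration only, and not as a limitation, the memory-side buffering and priority ordering described above might be sketched as follows. The sketch is in Python; the class, its methods, and its tuple-shaped responses are hypothetical conveniences for exposition, not part of any embodiment:

```python
import heapq
import itertools

class MemoryDevice:
    """Illustrative sketch: a memory device that buffers incoming
    read/write commands in a priority order and performs them later."""

    def __init__(self):
        self.cells = {}               # address -> stored data
        self.pending = []             # min-heap of (priority, seq, command)
        self.seq = itertools.count()  # tie-breaker preserving arrival order

    def receive(self, op, address, data=None, priority=0):
        # Buffer the command; lower priority values are performed first.
        heapq.heappush(self.pending,
                       (priority, next(self.seq), (op, address, data)))

    def perform_next(self):
        # Pop the highest-priority buffered command and perform it.
        _, _, (op, address, data) = heapq.heappop(self.pending)
        if op == "read":
            # Respond with the read data on the device-to-host link.
            return ("read_data", address, self.cells.get(address))
        else:
            # Perform the write, then acknowledge it.
            self.cells[address] = data
            return ("write_ack", address, None)
```

In this sketch, a write buffered at priority 0 is performed before a read buffered at priority 1, and each operation produces its response (read data or write acknowledgement) on what would be the device-to-host link.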
Because the communication link (sometimes referred to herein as a “communication bus” or a “bus”) from the memory device to the computing device, disposed to communicate read data, is separate from the communication link from the computing device to the memory device, disposed to communicate write data, there is no particular requirement to allocate separate time slots on either communication link for read data and write data. This can have the effect that the computing device and the memory device can communicate read commands/responses and write commands/responses bidirectionally and without substantial delay.
Because the memory device can perform buffering and priority ordering of read operations and write operations without explicit instruction by the computing device, the computing device can operate without relatively complex circuitry to perform that buffering or priority ordering. For example, the computing device can make do with relatively simple buffering or priority ordering (such as in response to multiple requests for read/write operations from distinct subassemblies within the computing device), or in some cases, with almost no buffering or priority ordering circuitry. This can have the effect that the computing device can allocate space for circuitry, wiring, power; reduce the size, cost, or design complexity of the ASIC; or otherwise assign such circuitry, wiring, or power to functions preferred by a designer thereof.
Because the memory device can perform buffering and priority ordering of read operations and write operations without explicit instruction by the computing device, the memory device can operate without instruction by the computing device to optimize read operations and write operations with more than one memory bank. For example, the memory device can reorder or otherwise group read operations and write operations to provide that such operations can be directed to multiple memory banks in a manner that optimizes the number of operations performed by each memory bank, without the computing device having to be involved in any such optimization.
Because the computing device and the memory device can communicate bidirectionally without having to reserve time slots for delayed responses, the communication links between the two devices can operate more efficiently and without substantial wasted capacity. This can have the effects that fewer read/write operations are delayed, and that fewer read/write operations are buffered to wait while other operations are performed.
In one embodiment, the computing device can include an application-specific integrated circuit (ASIC), a processor, a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), a video processing unit (VPU) or another type of AI or ML device, a cryptography unit, a system-on-a-chip (SoC), a field-programmable gate array (FPGA), another type of computing device that can interface with one or more memory devices, or a combination or conjunction of multiple ones or multiple types of such devices. In one embodiment, the memory device can include a dynamic RAM (DRAM), a static RAM (SRAM), a synchronous DRAM (SDRAM), a double-data-rate SDRAM (DDR), a low-power DDR (LPDDR), a high-bandwidth memory (HBM), a memory cache, a multi-level memory such as one including one or more levels of cache and a memory device, a database, another type of memory device that can interface with one or more devices, or a combination or conjunction of multiple ones or multiple types of such devices.
The following terms and phrases are exemplary only, and not limiting.
The phrases “this application”, “this description”, and variants thereof, generally refer to any material shown or suggested by any portions of this Application, individually or collectively, and including all inferences that might be drawn by anyone skilled in the art after reviewing this Application, even if that material would not have been apparent without reviewing this Application at the time it was filed.
The phrases “computing device”, and variants thereof, generally refer to any device (or portion thereof) that might be disposed to issue read and/or write commands to a “memory device”, such as described herein, or multiple ones or multiple types of such memory devices, whether in parallel or series, whether to one or more separate devices or banks thereof, whether to a distributed or singular such device, whether to a logically local or remote such device, or otherwise as described herein. For example, the computing device can include an application-specific integrated circuit (ASIC), a processor, a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), a video processing unit (VPU) or another type of AI or ML device, a cryptography unit, a system-on-a-chip (SoC), a field-programmable gate array (FPGA), another type of computing device that can interface with one or more memory devices, a combination or conjunction of multiple ones or multiple types of such devices, or otherwise as described herein.
The phrases “memory device”, and variants thereof, generally refer to any device (or portion thereof) that might be disposed to receive read and/or write commands from a “computing device”, such as described herein, or multiple ones or multiple types of such computing devices, whether in parallel or series, whether to one or more separate devices or banks thereof, whether to a distributed or singular such device, whether from a logically local or remote such device, or otherwise as described herein. For example, the memory device can include a dynamic RAM (DRAM), a static RAM (SRAM), a synchronous DRAM (SDRAM), a double-data-rate SDRAM (DDR), a low-power DDR (LPDDR), a high-bandwidth memory (HBM), a memory cache, a multi-level memory such as one including one or more levels of cache and a memory device, a database, another type of memory device that can interface with one or more devices, a combination or conjunction of multiple ones or multiple types of such devices, or otherwise as described herein.
The phrases “communication link”, “communication bus”, “bus”, and variants thereof, generally refer to any device (or portion thereof) that might be disposed to send information from a first device to a second device, whether or not that information is retained at the first device, whether or not that information is acknowledged or assured to be received by the second device, whether or not that information undergoes substantial delay or is transmitted by intermediate devices, or otherwise as described herein. For example, a communication link can include an electrical, optical, or electrooptical coupling between the first and second devices, a circuit-switched or packet-switched network including the first and second devices, a redundant modular or otherwise reliable distributed communication system, or otherwise as described herein.
After reviewing this Application, those skilled in the art would recognize that these terms and phrases should be interpreted in light of their context in the specification.
In one embodiment, a system 100 can include one or more of: a computing device 110, a memory device 120, a command/address communication link 130, and a data communication link 140.
The computing device 110 can include an ASIC or another device disposed to issue read and/or write commands to the memory device 120. The memory device 120 can include a DRAM or another device disposed to receive read commands and/or write commands from the computing device 110. The communication links 130 and/or 140 are also sometimes referred to herein as a “bus” or “busses” 130 and/or 140.
In one embodiment, the command/address bus 130 can include one or more time slots 131, each disposed to communicate an address (such as a read address associated with a read command, or a write address associated with a write command). The command/address bus 130 can also include one or more typically much shorter time slots (not shown) associated with specific commands indicating whether the associated command is a read command or a write command and/or acknowledgements of those read commands or write commands.
Separate Read/Write Communication Links
In one embodiment, the data bus 140 can be disposed to logically include a read communication link 141, such as controlled by the memory device 120. The read communication link 141 can include a sequence of read-data time slots 142 each allocated solely to read data to be communicated from the memory device 120 to the computing device 110. The data bus 140 can be disposed to logically include a write communication link 143, controlled by the computing device 110. The write communication link 143 can include a sequence of write-data time slots 144 each allocated solely to write data to be communicated from the computing device 110 to the memory device 120.
As the read communication link 141 is separate from the write communication link 143, read data can be communicated using the read-data time slots 142 on the read communication link 141 concurrently with write data being communicated using the write-data time slots 144 on the write communication link 143. This can have the effect that both the read communication link 141 and the write communication link 143 can operate concurrently. Thus, the computing device 110 and the memory device 120 need not allocate extra read-data time slots 142 or write-data time slots 144.
As described herein, the command/address bus 130 can include one or more (typically much shorter) time slots (not shown), each of which can indicate whether the associated command is a read command or a write command, and each of which can include an acknowledgement of a read command or a write command. In alternative embodiments, acknowledgements of read commands or write commands can be disposed on one or more (typically much shorter) time slots (not shown), on the read communication link 141 or write communication link 143.
As described herein, the read communication link 141 and the write communication link 143 can include separate physical wires suitable for communicating electromagnetic signals (thus, in parallel and possibly concurrently), or can be combined into a single physical wire suitable for communicating multiple electromagnetic signals (thus, possibly concurrently). In the latter case, the single physical wire can be disposed to communicate the multiple electromagnetic signals concurrently or simultaneously, thus, with the effect that read operations and write operations can be performed at the same time.
Read/Write Communication Delays
As the read communication link 141 is separate from the write communication link 143, a (read) delay between issue of a read command on the command/address bus 130 and an associated read-data time slot 142 on the read communication link 141 need not have any particular relationship to a (write) delay between issue of a write command on the command/address bus 130 and an associated write-data time slot 144 on the write communication link 143. Similarly, a (read) delay between issue of a read command and an associated acknowledgement thereof, need not have any particular relationship to a (write) delay between issue of a write command and an associated acknowledgement thereof. However, in practice it is common for a (read) delay between issue of a read command and an associated acknowledgement thereof, and for a (write) delay between issue of a write command and an associated acknowledgement thereof, to each occur on a clock cycle associated with the interface between the computing device 110 and the memory device 120.
While the figure shows a delay of about half of a read-data time slot 142 between issue of a read command from the computing device 110 and a response from the memory device 120 with associated read data, the actual delay would likely be driven by delay incurred by the memory device 120 itself. For example, when the memory device 120 includes a multi-level memory (or when the memory device 120 includes a relatively larger DRAM with a relatively larger number of memory banks or a relatively slower access time), the actual delay incurred by the memory device 120 might depend on whether the requested data was found in a memory cache (a “cache hit”) or whether the requested data had to be requested from a slower and relatively more voluminous non-cache memory device (a “cache miss”). There is no particular requirement that the actual delay is in fact about half of a read-data time slot 142; it might be shorter or longer.
Similarly, while the figure also shows a delay of about half of a write-data time slot 144 between issue of a write command and issue of the associated write data, the actual delay would likely be driven by delay incurred by the computing device 110, and possibly also by a timing protocol with respect to when the memory device 120 expects write data from the computing device 110. For example, when the computing device 110 issues a write command from a register file, the actual delay incurred by the computing device 110 might depend on a read sensor associated with that register file; in contrast, when the computing device 110 issues a write command from an instruction pipeline, the actual delay incurred by the computing device 110 might depend on a branch prediction circuit or other circuit associated with decoding an instruction associated with that write command. There is no particular requirement that the actual delay is in fact about half of a write-data time slot 144; it might be shorter or longer.
Computing Device Read/Write Commands
In one embodiment, the computing device 110 can include a set of computing circuits 111, disposed to perform the functions of the computing device 110, and a host fabric interface 112, disposed to interface between the computing circuits 111 and a set of communication circuits 113 disposed to communicate with the memory device 120. In one embodiment, the communication circuits 113 can include one or more of: a command queue 113a, a read queue 113b, a write queue 113c, and/or an ordering/transaction optimization circuit 113d, each such as described herein.
In one embodiment, when the computing circuits 111 issue a read/write command 114 to the memory device 120, the host fabric interface 112 transmits that read/write command 114 to the communication circuits 113. The communication circuits 113 can receive the read/write command 114 and maintain it in the command queue 113a until fully processed.
In one embodiment, when the read/write command 114 arrives at the command queue 113a, the ordering/transaction optimization circuit 113d can process those read/write commands 114 according to their nature.
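For illustration only, the routing of read/write commands 114 from the command queue 113a into the read queue 113b or the write queue 113c, according to their nature, might be sketched as follows. The sketch is in Python; the class and field names are hypothetical and are chosen only to echo the reference numerals above:

```python
from collections import deque

class CommunicationCircuits:
    """Illustrative sketch of communication circuits on the computing
    device: a command queue feeding separate read and write queues,
    as routed by a simple ordering/transaction circuit."""

    def __init__(self):
        self.command_queue = deque()  # commands awaiting routing (113a)
        self.read_queue = deque()     # read commands awaiting issue (113b)
        self.write_queue = deque()    # write commands awaiting issue (113c)

    def submit(self, command):
        # The host fabric interface hands the command to the command queue.
        self.command_queue.append(command)

    def route_pending(self):
        # The ordering/transaction circuit (113d) processes each
        # command according to its nature: read or write.
        while self.command_queue:
            command = self.command_queue.popleft()
            if command["op"] == "read":
                self.read_queue.append(command)
            else:
                self.write_queue.append(command)
```

Note how little the computing-device side needs to do in this sketch: it merely sorts commands by kind, consistent with the simplification described herein.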
As described herein, the communication circuits 113, including the command queue 113a, read queue 113b, write queue 113c, and ordering/transaction optimization circuit 113d, can be substantially simplified with respect to the form that might have been involved if priority ordering and allocation of read/write time slots were performed with respect to a single read/write communication link and were performed on the computing device 110. Allocating the most difficult functions of the communication circuits 113 to the memory device 120 itself can have the effect that substantial complexity, space, and power requirements can be freed for other uses by the computing device 110. Thus, performing priority ordering and allocation of read/write time slots on the memory device 120 can allow the communication circuits 113 on the computing device 110 to be relatively simpler, to occupy relatively less space, and to involve use of relatively less power.
For example, when the memory device 120 includes its own processing of read/write commands 114 and its own priority ordering thereof, this can have the effect that the computing device 110 can issue read/write commands 114 to the memory device 120 without substantial concern that those read/write commands 114 involve any particular ordering or any particular allocation of time slots on a single communication link for both read/write commands 114. Read commands can be separately issued, in no particularly required sequence, using the read-data time slots 142 on the read communication link 141, while write commands can be separately issued, in no particularly required sequence, using the write-data time slots 144 on the write communication link 143.
Memory Device Read/Write Commands
In one embodiment, the memory device 120 can include a set of memory circuits 121, disposed to perform the functions of the memory device 120, and a memory command interface 124, disposed to interface between the memory circuits 121 and a set of communication circuits 122 disposed to communicate with the computing device 110.
In one embodiment, the memory device 120 can (optionally) include a set of multiple memory banks 123, such as disposed to operate concurrently in response to read/write commands 114 from the memory circuits 121. For example, when the memory device 120 includes 1 Gigabyte (GB) of memory elements (not shown), those memory elements can be disposed as 1,024 parallel memory banks 123 each being 1 Megabyte (MB) in size. For another example, the memory device 120 can include a different number of memory banks 123 having a different substantially uniform size. For another example, the memory device 120 can include a different number of memory banks 123, not having a substantially uniform size, thus, at least some of which have a substantially different size.
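For illustration only, the arithmetic of the first example above (1 GB of memory elements disposed as 1,024 parallel banks of 1 MB each), together with one simple address-to-bank mapping, might be sketched as follows. The mapping shown is hypothetical; nothing herein requires this or any particular mapping:

```python
TOTAL_BYTES = 1 << 30                  # 1 Gigabyte of memory elements
NUM_BANKS = 1024                       # 1,024 parallel memory banks
BANK_BYTES = TOTAL_BYTES // NUM_BANKS  # each bank: 1 Megabyte (2**20 bytes)

def bank_of(address):
    # One simple illustrative mapping: consecutive 1 MB regions of the
    # address space map to consecutive memory banks.
    return (address // BANK_BYTES) % NUM_BANKS
```

Under this mapping, address 0 falls in bank 0, address 2**20 falls in bank 1, and the last byte of the 1 GB space falls in bank 1,023.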
In one embodiment, the memory device 120 can (optionally) include a multilevel memory structure, thus having at least some memory elements (not shown) which are relatively faster and likely relatively more expensive or less numerous, and at least some memory elements (not shown) which are relatively slower and likely relatively less expensive or more numerous. For example, the memory device 120 can include a set of cache elements (not shown) disposed to retain memory elements deemed likely to be more frequently accessed, at least in the near term, and a set of non-cache elements (not shown) disposed to retain memory elements deemed likely to be less frequently accessed, at least in the near term.
In one embodiment, the communication circuits 122 can include one or more of: a read queue 122a, a write queue 122b, and/or an ordering/transaction optimization circuit 122c, each such as described herein.
In one embodiment, the memory device 120 is coupled to the command/address bus 130 and is coupled to the read communication link 141 and to the write communication link 143.
In one embodiment, when one or more read/write commands 114 are received at one or more of the read queue 122a or the write queue 122b, the ordering/transaction optimization circuit 122c can determine which one or more of those read/write commands 114 is to be given priority order. The ordering/transaction optimization circuit 122c couples the read/write command 114 that is given priority order to the memory circuits 121 to be performed.
For example, when the one or more read/write commands 114 are directed to a set of distinct memory banks 123 associated with the memory device 120, the memory device 120 can assign a priority order to those read/write commands 114 so as to provide that each set of read operations, or each set of write operations, directed to the same memory bank 123, are performed concurrently. This can have the effect that the read/write commands 114 directed to the same memory bank 123 can be performed with relatively greater bandwidth, relatively lesser latency, and/or relatively less power involved in operating the memory bank 123. For example, when a particular memory bank 123 is powered up for one or more read/write commands 114, following accesses (within a selected time duration) to the same memory bank 123 can often involve substantially less power consumption and take substantially less time than if those following accesses were to occur later than that selected time duration.
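For illustration only, the bank-grouping priority ordering described above might be sketched as follows. The sketch is in Python; the function, its dictionary-shaped commands, and the `bank_of` parameter are hypothetical conveniences, not a required implementation:

```python
from collections import defaultdict

def group_by_bank(commands, bank_of):
    """Illustrative sketch: assign a priority order that groups pending
    read/write commands directed to the same memory bank, so that each
    bank can service its accesses together while it is already powered
    up, rather than being re-activated for scattered accesses."""
    groups = defaultdict(list)
    for command in commands:
        groups[bank_of(command["address"])].append(command)
    ordered = []
    for bank in sorted(groups):       # visit each bank once
        ordered.extend(groups[bank])  # arrival order kept within a bank
    return ordered
```

In this sketch, commands arriving interleaved across banks come out batched per bank, which is the ordering property said above to improve bandwidth, latency, and power.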
In one embodiment, when one or more read/write commands 114 are received at one or more of the read queue 122a or the write queue 122b, it might occur that the ordering/transaction optimization circuit 122c determines that both a read command and a write command can be performed concurrently. For example, the memory circuit 121 can perform a read command from a first bank of memory concurrently with performing a write command to a separate and non-overlapping second bank of memory.
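For illustration only, the concurrency condition in the example above (a read and a write can proceed together when directed to separate, non-overlapping memory banks) might be sketched as a single hypothetical predicate:

```python
def can_run_concurrently(read_cmd, write_cmd, bank_of):
    # Illustrative sketch: a read command and a write command can be
    # performed concurrently when they target distinct memory banks,
    # so the two operations do not contend for the same bank.
    return bank_of(read_cmd["address"]) != bank_of(write_cmd["address"])
```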
As described herein, when one or more queues and/or buffers for read/write commands 114, including possibly read operations and/or write operations, are maintained on the memory device 120, the computing device 110 can issue concurrent read/write commands 114, including possibly read operations and/or write operations, without requiring complex ordering or buffering circuitry on the computing device, or at least allowing the computing device to use only relatively simplified ordering or buffering.
Giving Greater Priority Order to Write Commands than Read Commands
In one embodiment, the memory device 120 can give greater priority to write commands than to read commands, such as to provide speed to a computing device 110 that might otherwise wait for write operations to be completed before performing read operations. In such cases, the ordering/transaction optimization circuit 122c can give greater priority to read/write commands 114 received at the write queue 122b (thus, write commands) than to read/write commands 114 received at the read queue 122a (thus, read commands).
In such cases, the write queue 122b should be emptied by the memory device 120 at a relatively faster rate than the read queue 122a is emptied. This can have the effect that the write queue 122b could be emptied substantially immediately upon receipt of write commands. Thus, all pending write commands would be completed before any pending read commands are performed.
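For illustration only, this write-priority ordering, in which all pending write commands are drained before any pending read command is performed, might be sketched as follows. The function name and queue arguments are hypothetical:

```python
def next_command(read_queue, write_queue):
    """Illustrative sketch of write-priority ordering: the write queue
    is emptied before any pending read command is performed."""
    if write_queue:
        return write_queue.popleft()  # drain all pending writes first
    if read_queue:
        return read_queue.popleft()   # only then perform pending reads
    return None                       # nothing pending
```

The read-priority alternative described below would simply check the read queue first; the two policies differ only in which queue the selection logic consults first.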
With some computing devices 110, instructions might be retrieved from the memory device 120, such as from an instruction portion of program/data memory, and decoded and performed with as little latency as possible. Such computing devices 110 might reorder performance of those instructions so as to minimize latency, and might even perform instructions speculatively (thus, without knowing for sure whether results of those speculative instructions will actually be used). In such cases, when the results of a particular read operation are dependent upon the results of one or more write operations, the computing device 110 might have to wait for the write operation to be completed before performing the read operation. This can have the effect that pending write operations should be completed before performing pending read operations.
Giving Greater Priority Order to Read than Write Commands
In an alternative embodiment, the memory device 120 can give greater priority to read commands than to write commands, such as to provide speed to a computing device 110 that performs many more read operations than write operations. In such cases, the ordering/transaction optimization circuit 122c can give priority to read/write commands 114 received at the read queue 122a (thus, read commands) over read/write commands 114 received at the write queue 122b (thus, write commands).
In such cases, the read queue 122a should be emptied by the memory device 120 at a relatively faster rate than the write queue 122b is emptied. In ordinary use of many computing devices 110, this might be balanced by the read queue 122a also being filled by the computing device 110 at a similar relatively faster rate than the write queue 122b is filled.
As described herein, because the memory device 120 can perform buffering and priority ordering of read/write commands 114 without explicit instruction by the computing device 110, the computing device 110 can operate without relatively complex circuitry to perform that buffering or priority ordering. For example, the computing device 110 can make do with relatively simple buffering or priority ordering (such as in response to multiple requests for read/write commands 114 from distinct subassemblies within the computing device), or in some cases, with almost no buffering or priority ordering circuitry. This can have the effect that the computing device 110 can allocate space for circuitry, wiring, power, or otherwise, to functions preferred by a designer thereof.
Because the computing device 110 and the memory device 120 can communicate bidirectionally without having to reserve either read time slots 142 or write time slots 144 for delayed responses, the read communication link 141 and the write communication link 143 between the two devices can operate more efficiently and without substantial wasted capacity. This can have the effects that fewer read/write commands 114 are delayed, and that fewer read/write commands 114 are buffered to wait while other operations are performed.
While this Application primarily describes systems and techniques that primarily relate to communication between a computing device and a memory device, there is no particular requirement for any such limitation. After reading this Application, those skilled in the art will recognize that the techniques described herein are applicable to a wide variety of devices disposed to issue read and/or write commands, and to a wide variety of devices disposed to respond thereto. For example, the techniques described herein are applicable to a wide variety of different types of devices disposed to maintain information and to provide and/or revise that information upon request or command, such as caching systems and multi-level memory systems; file structures, hash tables, and/or graph structures; artificial intelligence or machine learning training systems; or otherwise as described herein.
Moreover, after reading this Application, those skilled in the art will recognize that the techniques described herein are applicable to a wide variety of different types of devices which can communicate using commands and/or data, and a wide variety of different types of data communication, whether communicated using read/write commands or otherwise. For example, the techniques described herein are applicable to a wide variety of different types of devices which can issue commands to request status of a device, such as possibly submitting a query to a database system or to a search engine, or possibly reading status of a hardware control system or a network monitoring system. For another example, the techniques described herein are applicable to a wide variety of different types of devices which can issue commands to control status of a device, such as possibly writing to a database system or posting data to a network application, or such as altering status of a hardware control system or a circuit-switching or packet-switching communication network.
This Application describes a preferred embodiment with preferred process steps and, where applicable, preferred data structures. After reading this Application, those skilled in the art would recognize that, where any calculation or computation is appropriate, embodiments of the description can be implemented using general purpose computing devices or switching processors, special purpose computing devices or switching processors, other circuits adapted to particular process steps and data structures described herein, or combinations or conjunctions thereof, and that implementation of the process steps and data structures described herein would not require undue experimentation or further invention.
The claims are incorporated into the specification as if fully set forth herein.
Number | Name | Date | Kind |
---|---|---|---|
4334305 | Girardi | Jun 1982 | A |
5396581 | Mashiko | Mar 1995 | A |
5677569 | Choi | Oct 1997 | A |
5892287 | Hoffman | Apr 1999 | A |
5910010 | Nishizawa | Jun 1999 | A |
6031729 | Berkely | Feb 2000 | A |
6055235 | Blanc | Apr 2000 | A |
6417737 | Moloudi | Jul 2002 | B1 |
6492727 | Nishizawa | Dec 2002 | B2 |
6690742 | Chan | Feb 2004 | B2 |
6721313 | Van Duyne | Apr 2004 | B1 |
6932618 | Nelson | Aug 2005 | B1 |
7027529 | Ohishi | Apr 2006 | B1 |
7248890 | Raghavan | Jul 2007 | B1 |
7269212 | Chau | Sep 2007 | B1 |
7330930 | Nagshain | Feb 2008 | B1 |
7477615 | Oshita | Jan 2009 | B2 |
7535958 | Best | May 2009 | B2 |
7593271 | Ong | Sep 2009 | B2 |
7701957 | Bicknell | Apr 2010 | B1 |
7907469 | Sohn et al. | Mar 2011 | B2 |
7978754 | Yeung | Jul 2011 | B2 |
8004330 | Acimovic | Aug 2011 | B1 |
8024142 | Gagnon | Sep 2011 | B1 |
8121541 | Rofougaran | Feb 2012 | B2 |
8176238 | Yu et al. | May 2012 | B2 |
8468381 | Jones | Jun 2013 | B2 |
8483579 | Fukuda | Jul 2013 | B2 |
8546955 | Wu | Oct 2013 | B1 |
8704364 | Banijamali et al. | Apr 2014 | B2 |
8861573 | Chu | Oct 2014 | B2 |
8948203 | Nolan | Feb 2015 | B1 |
8982905 | Kamble | Mar 2015 | B2 |
9088334 | Chakraborty | Jul 2015 | B2 |
9106229 | Hutton | Aug 2015 | B1 |
9129935 | Chandrasekar | Sep 2015 | B1 |
9294313 | Prokop | Mar 2016 | B2 |
9349707 | Sun | May 2016 | B1 |
9379878 | Lugthart | Jun 2016 | B1 |
9432298 | Smith | Aug 2016 | B1 |
9558143 | Leidel | Jan 2017 | B2 |
9832006 | Bandi | Nov 2017 | B1 |
9843538 | Woodruff | Dec 2017 | B2 |
9886275 | Carlson | Feb 2018 | B1 |
9934842 | Mozak | Apr 2018 | B2 |
9961812 | Suorsa | May 2018 | B2 |
9977731 | Pyeon | May 2018 | B2 |
10171115 | Shirinfar | Jan 2019 | B1 |
10402363 | Long et al. | Sep 2019 | B2 |
10410694 | Arbel | Sep 2019 | B1 |
10439661 | Heydari | Oct 2019 | B1 |
10642767 | Farjadrad | May 2020 | B1 |
10678738 | Dai | Jun 2020 | B2 |
10735176 | Heydari | Aug 2020 | B1 |
10748852 | Sauter | Aug 2020 | B1 |
10769073 | Desai | Sep 2020 | B2 |
10803548 | Matam et al. | Oct 2020 | B2 |
10804204 | Rubin et al. | Oct 2020 | B2 |
10825496 | Murphy | Nov 2020 | B2 |
10826536 | Beukema | Nov 2020 | B1 |
10855498 | Farjadrad | Dec 2020 | B1 |
10935593 | Goyal | Mar 2021 | B2 |
11088876 | Farjadrad | Aug 2021 | B1 |
11100028 | Subramaniam | Aug 2021 | B1 |
11164817 | Rubin et al. | Nov 2021 | B2 |
11204863 | Sheffler | Dec 2021 | B2 |
11481116 | Tavallaei | Oct 2022 | B2 |
11782865 | Kochavi | Oct 2023 | B1 |
11789649 | Chatterjee et al. | Oct 2023 | B2 |
11841815 | Farjadrad | Dec 2023 | B1 |
11842986 | Farjadrad | Dec 2023 | B1 |
11855043 | Farjadrad | Dec 2023 | B1 |
11855056 | F.Rad | Dec 2023 | B1 |
11892242 | Mao | Feb 2024 | B2 |
11893242 | Farjadrad | Feb 2024 | B1 |
11983125 | Soni | May 2024 | B2 |
11989416 | Tavallaei | May 2024 | B2 |
12001355 | Dreier | Jun 2024 | B1 |
12001725 | Chatterjee | Jun 2024 | B2 |
20020122479 | Agazzi | Sep 2002 | A1 |
20020136315 | Chan | Sep 2002 | A1 |
20040088444 | Baumer | May 2004 | A1 |
20040113239 | Prokofiev | Jun 2004 | A1 |
20040130347 | Moll | Jul 2004 | A1 |
20040156461 | Agazzi | Aug 2004 | A1 |
20050041683 | Kizer | Feb 2005 | A1 |
20050134306 | Stojanovic | Jun 2005 | A1 |
20050157781 | Ho | Jul 2005 | A1 |
20050205983 | Origasa | Sep 2005 | A1 |
20060060376 | Yoon | Mar 2006 | A1 |
20060103011 | Andry | May 2006 | A1 |
20060158229 | Hsu | Jul 2006 | A1 |
20060181283 | Wajcer | Aug 2006 | A1 |
20060188043 | Zerbe | Aug 2006 | A1 |
20060250985 | Baumer | Nov 2006 | A1 |
20060251194 | Bublil | Nov 2006 | A1 |
20070281643 | Kawai | Dec 2007 | A1 |
20080063395 | Royle | Mar 2008 | A1 |
20080086282 | Artman | Apr 2008 | A1 |
20080143422 | Lalithambika | Jun 2008 | A1 |
20080186987 | Baumer | Aug 2008 | A1 |
20080222407 | Carpenter | Sep 2008 | A1 |
20090113158 | Schnell | Apr 2009 | A1 |
20090154365 | Diab | Jun 2009 | A1 |
20090174448 | Zabinski | Jul 2009 | A1 |
20090220240 | Abhari | Sep 2009 | A1 |
20090225900 | Yamaguchi | Sep 2009 | A1 |
20090304054 | Tonietto | Dec 2009 | A1 |
20100177841 | Yoon | Jul 2010 | A1 |
20100197231 | Kenington | Aug 2010 | A1 |
20100294547 | Hatanaka | Nov 2010 | A1 |
20110029803 | Redman-White | Feb 2011 | A1 |
20110038286 | Ta | Feb 2011 | A1 |
20110167297 | Su | Jul 2011 | A1 |
20110187430 | Tang | Aug 2011 | A1 |
20110204428 | Erickson | Aug 2011 | A1 |
20110267073 | Chengson | Nov 2011 | A1 |
20110293041 | Luo | Dec 2011 | A1 |
20120082194 | Tam | Apr 2012 | A1 |
20120182776 | Best | Jul 2012 | A1 |
20120192023 | Lee | Jul 2012 | A1 |
20120216084 | Chun | Aug 2012 | A1 |
20120327818 | Takatori | Dec 2012 | A1 |
20130222026 | Havens | Aug 2013 | A1 |
20130249290 | Buonpane | Sep 2013 | A1 |
20130285584 | Kim | Oct 2013 | A1 |
20140016524 | Choi | Jan 2014 | A1 |
20140048947 | Lee | Feb 2014 | A1 |
20140126613 | Zhang | May 2014 | A1 |
20140192583 | Rajan | Jul 2014 | A1 |
20140269860 | Brown | Sep 2014 | A1 |
20140269983 | Baeckler | Sep 2014 | A1 |
20150012677 | Nagarajan | Jan 2015 | A1 |
20150046612 | Gupta | Feb 2015 | A1 |
20150172040 | Pelekhaty | Jun 2015 | A1 |
20150180760 | Rickard | Jun 2015 | A1 |
20150206867 | Lim | Jul 2015 | A1 |
20150271074 | Hirth | Sep 2015 | A1 |
20150326348 | Shen | Nov 2015 | A1 |
20150358005 | Chen | Dec 2015 | A1 |
20160056125 | Pan | Feb 2016 | A1 |
20160071818 | Wang | Mar 2016 | A1 |
20160111406 | Mak | Apr 2016 | A1 |
20160217872 | Hossain | Jul 2016 | A1 |
20160294585 | Rahman | Oct 2016 | A1 |
20170255575 | Niu | Sep 2017 | A1 |
20170286340 | Ngo | Oct 2017 | A1 |
20170317859 | Hormati | Nov 2017 | A1 |
20170331651 | Suzuki | Nov 2017 | A1 |
20180010329 | Golding, Jr. | Jan 2018 | A1 |
20180082981 | Gowda | Mar 2018 | A1 |
20180137005 | Wu | May 2018 | A1 |
20180175001 | Pyo | Jun 2018 | A1 |
20180190635 | Choi | Jul 2018 | A1 |
20180196767 | Linstadt | Jul 2018 | A1 |
20180210830 | Malladi et al. | Jul 2018 | A1 |
20180315735 | Delacruz | Nov 2018 | A1 |
20190044764 | Hollis | Feb 2019 | A1 |
20190058457 | Ran | Feb 2019 | A1 |
20190108111 | Levin | Apr 2019 | A1 |
20190198489 | Kim | Jun 2019 | A1 |
20190267062 | Tan | Aug 2019 | A1 |
20190319626 | Dabral | Oct 2019 | A1 |
20200051961 | Rickard | Feb 2020 | A1 |
20200105718 | Collins et al. | Apr 2020 | A1 |
20200257619 | Sheffler | Aug 2020 | A1 |
20200320026 | Kabiry | Oct 2020 | A1 |
20200364142 | Lin | Nov 2020 | A1 |
20200373286 | Dennis | Nov 2020 | A1 |
20210056058 | Lee | Feb 2021 | A1 |
20210082875 | Nelson | Mar 2021 | A1 |
20210117102 | Grenier | Apr 2021 | A1 |
20210149763 | Ranganathan | May 2021 | A1 |
20210181974 | Ghosh | Jun 2021 | A1 |
20210183842 | Fay | Jun 2021 | A1 |
20210193567 | Cheah et al. | Jun 2021 | A1 |
20210225827 | Lanka | Jul 2021 | A1 |
20210258078 | Meade | Aug 2021 | A1 |
20210311900 | Malladi | Oct 2021 | A1 |
20210365203 | O | Nov 2021 | A1 |
20210405919 | K | Dec 2021 | A1 |
20220051989 | Agarwal | Feb 2022 | A1 |
20220159860 | Winzer | May 2022 | A1 |
20220189934 | Kim | Jun 2022 | A1 |
20220223522 | Scearce | Jul 2022 | A1 |
20220254390 | Gans | Aug 2022 | A1 |
20220350756 | Burstein | Nov 2022 | A1 |
20220391114 | Richter | Dec 2022 | A1 |
20230039033 | Zarkovsky | Feb 2023 | A1 |
20230068802 | Wang | Mar 2023 | A1 |
20230090061 | Zarkovsky | Mar 2023 | A1 |
20230092541 | Dugast | Mar 2023 | A1 |
20230161599 | Erickson | May 2023 | A1 |
20230289311 | Noguera Serra | Sep 2023 | A1 |
20240007234 | Harrington | Jan 2024 | A1 |
20240028208 | Kim | Jan 2024 | A1 |
20240241840 | Im | Jul 2024 | A1 |
Entry |
---|
Block Memory Generator v8.2 LogiCORE IP Product Guide Vivado Design Suite; Xilinx; Apr. 1, 2015; retrieved from https://docs.xilinx.com/v/u/8.2-English/pg058-blk-mem-gen on Jan. 25, 2024 (Year: 2015). |
M. Palesi, E. Russo, A. Das and J. Jose, “Wireless enabled Inter-Chiplet Communication in DNN Hardware Accelerators,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, FL, USA, 2023, pp. 477-483, doi: 10.1109/IPDPSW59300.2023.00081. (Year: 2023). |
K. Drucker et al., “The Open Domain-Specific Architecture,” 2020 IEEE Symposium on High-Performance Interconnects (HOTI), Piscataway, NJ, USA, 2020, pp. 25-32, doi: 10.1109/HOTI51249.2020.00019. (Year: 2020). |
H. Sharma et al., “SWAP: A Server-Scale Communication-Aware Chiplet-Based Manycore PIM Accelerator,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, No. 11, pp. 4145-4156, Nov. 2022, doi: 10.1109/TCAD.2022.3197500 (Year: 2022). |
X. Duan, M. Miao, Z. Zhang and L. Sun, “Research on Double-Layer Networks-on-Chip for Inter-Chiplet Data Switching on Active Interposers,” 2021 22nd International Conference on Electronic Packaging Technology (ICEPT), Xiamen, China, 2021, pp. 1-6, doi: 10.1109/ICEPT52650.2021.9567983. (Year: 2021). |
U.S. Appl. No. 16/812,234; Mohsen F. Rad; filed Mar. 6, 2020. |
Farjadrad et al., “A Bunch of Wires (BoW) Interface for Inter-Chiplet Communication”, 2019 IEEE Symposium on High-Performance Interconnects (HOTI), pp. 27-30, Oct. 2019. |
“Hot Chips 2017: Intel Deep Dives Into EMIB”, TomsHardware.com; Aug. 25, 2017. |
“Using Chiplet Encapsulation Technology to Achieve Processing-In-Memory Functions”; Micromachines 2022, 13, 1790; https://www.mdpi.com/journal/micromachines; Tian et al. |
“Multiport memory for high-speed interprocessor communication in MultiCom;” Scientia Iranica, vol. 8, No. 4, pp. 322-331; Sharif University of Technology, Oct. 2001; Asgari et al. |
Universal Chiplet Interconnect Express (UCIe) Specification, Revision 1.1, Version 1.0, Jul. 10, 2023. |
Hybrid Memory Cube Specification 2.1, Hybrid Memory Cube Consortium, HMC-30G-VSR PHY, 2014. |
Quartus II Handbook Version 9.0 vol. 4: SOPC Builder; “System Interconnect Fabric for Memory-Mapped Interfaces”; Mar. 2009. |
Universal Chiplet Interconnect Express (UCIe) Specification Rev. 1.0, Feb. 24, 2022. |
Brinda Ganesh et al., “Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling”, 2007, IEEE, 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp. 1-12 (Year: 2007). |
Anu Ramamurthy, “Chiplet Technology & Heterogeneous Integration” Jun. 2021, NASA, 2021 NEPP ETW, slides 1-17 (Year: 2021). |
Wikipedia, “Printed circuit board”, Nov. 9, 2021, Wayback Machine, as preserved by the Internet Archive on Nov. 9, 2021, pp. 1-23 (Year: 2021). |
Kurt Lender et al., “Questions from the Compute Express Link Exploring Coherent Memory and Innovative Cases Webinar”, Apr. 13, 2020, CXL consortium. |
Planet Analog, “The basics of SerDes (serializers/deserializers) for interfacing”, Dec. 1, 2020, Planet Analog. |
Number | Date | Country | |
---|---|---|---|
63190170 | May 2021 | US |