Examples of the present disclosure generally relate to electronic circuits and, in particular, to error correction code (ECC) proxy extension and byte organization for multi-master systems.
Advances in integrated circuit technology have made it possible to embed an entire system, including a processor core, a memory controller, and a bus, in a single semiconductor chip. This type of chip is commonly referred to as a system-on-chip (SoC). Other SoCs can have different components embedded therein for different applications. The SoC provides many advantages over traditional processor-based designs. It is an attractive alternative to multi-chip designs because the integration of components into a single device increases overall speed while decreasing size. The SoC is also an attractive alternative to fully customized chips, such as an application specific integrated circuit (ASIC), because ASIC designs tend to have significantly longer development times and higher development costs. A configurable SoC (CSoC), which includes programmable logic, has been developed to implement a programmable semiconductor chip that can obtain the benefits of both programmable logic and the SoC.
Error-correcting code (ECC) memory can detect and correct common types of internal data corruption. Typically, ECC memory using single error correct, double error detect (SECDED) can correct any single-bit error and detect any double-bit error in a data word. Data words read from memory locations thus match the data words written to those locations, even if one or more stored bits have been flipped to the incorrect state. Some types of double-data rate (DDR) memory devices do not support ECC due to architecture and/or cost constraints (e.g., ECC requires additional integrated circuits (ICs) and real estate on the memory module). However, it is still desirable to achieve error correction functionality when using these non-ECC memory devices.
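The SECDED behavior described above can be illustrated with a minimal extended Hamming code over a single byte. This sketch is for illustration only; it does not reflect the code used by any particular memory device.

```python
# Minimal SECDED (extended Hamming) over one byte: corrects any single-bit
# error and detects any double-bit error. Positions 1..12 hold data and
# Hamming parity in the usual layout; index 0 is an overall parity bit
# that distinguishes single-bit from double-bit errors.

DATA_POS = [3, 5, 6, 7, 9, 10, 11, 12]   # non-power-of-two positions
PARITY_POS = [1, 2, 4, 8]                # power-of-two positions

def encode(byte):
    """Return a 13-bit codeword as a list: index 0 = overall parity."""
    bits = [0] * 13
    data = [(byte >> i) & 1 for i in range(8)]
    for pos, bit in zip(DATA_POS, data):
        bits[pos] = bit
    for p in PARITY_POS:
        parity = 0
        for pos in range(1, 13):
            if pos & p and pos != p:     # positions covered by parity bit p
                parity ^= bits[pos]
        bits[p] = parity                 # enforce even parity over the group
    for pos in range(1, 13):
        bits[0] ^= bits[pos]             # overall parity over bits 1..12
    return bits

def decode(bits):
    """Return (byte, status): status is 'ok', 'corrected', or 'double'."""
    bits = bits[:]
    syndrome = 0
    for pos in range(1, 13):
        if bits[pos]:
            syndrome ^= pos              # XOR of positions of set bits
    overall = 0
    for b in bits:
        overall ^= b                     # parity over the whole codeword
    if syndrome and overall:             # single-bit error: flip it back
        bits[syndrome] ^= 1
        status = "corrected"
    elif syndrome:                       # clean overall parity but nonzero
        return None, "double"            # syndrome: double error, detected
    elif overall:                        # error hit the overall parity bit
        status = "corrected"
    else:
        status = "ok"
    byte = 0
    for i, pos in enumerate(DATA_POS):
        byte |= bits[pos] << i
    return byte, status
```

Flipping one bit of a codeword yields the original byte with status "corrected"; flipping two bits is reported as an uncorrectable double error.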
Techniques for error correction code (ECC) proxy extension and byte organization for multi-master systems are described. In an example, a multi-master system in a system-on-chip (SoC) includes: a plurality of master circuits; an error-correcting code (ECC) proxy bridge comprising hardened circuitry in the SoC; a local interconnect configured to couple the plurality of master circuits to the ECC proxy bridge; a memory not having ECC support; and a system interconnect configured to couple the ECC proxy bridge to the memory; wherein the ECC proxy bridge is configured to establish an ECC proxy region in the memory and, for each write transaction from the plurality of master circuits that targets the ECC proxy region, calculate and insert ECC bytes into the respective write transaction.
In another example, a programmable integrated circuit (IC) includes: a processing system; programmable logic; a plurality of master circuits disposed in the processing system, the programmable logic, or both the processing system and the programmable logic; an error-correcting code (ECC) proxy bridge comprising hardened circuitry in the programmable IC; a local interconnect configured to couple the plurality of master circuits to the ECC proxy bridge; a memory controller configured to interface with a memory that does not have ECC support; a system interconnect configured to couple the ECC proxy bridge to the memory controller; wherein the ECC proxy bridge is configured to establish an ECC proxy region in the memory and, for each write transaction from the plurality of master circuits that targets the ECC proxy region, calculate and insert ECC bytes into the respective write transaction.
In another example, a method of communication between a plurality of master circuits and a memory in a system-on-chip (SoC) includes: establishing, by an error-correcting code (ECC) proxy bridge comprising hardened circuitry in the SoC, an ECC proxy region in a memory; receiving write transactions from a plurality of master circuits for the memory, the write transactions targeting the ECC proxy region in the memory; calculating, for each write transaction, ECC bytes for bytes of the respective write transaction at the ECC proxy bridge; inserting, by the ECC proxy bridge, the ECC bytes into the respective write transactions; and forwarding the write transactions from the ECC proxy bridge to the memory.
These and other aspects may be understood with reference to the following detailed description.
So that the manner in which the above recited features can be understood in detail, a more particular description, briefly summarized above, may be had by reference to example implementations, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical example implementations and are therefore not to be considered limiting of the scope of the disclosure.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements of one example may be beneficially incorporated in other examples.
Various features are described hereinafter with reference to the figures. It should be noted that the figures may or may not be drawn to scale and that the elements of similar structures or functions are represented by like reference numerals throughout the figures. It should be noted that the figures are only intended to facilitate the description of the features. They are not intended as an exhaustive description of the claimed invention or as a limitation on the scope of the claimed invention. In addition, an illustrated example need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular example is not necessarily limited to that example and can be practiced in any other examples even if not so illustrated or if not so explicitly described.
The NMUs 202 are traffic ingress points. The NSUs 204 are traffic egress points. Endpoint circuits coupled to the NMUs 202 and NSUs 204 can be hardened circuits (e.g., hardened circuits 110) or circuits configured in programmable logic. A given endpoint circuit can be coupled to more than one NMU 202 or more than one NSU 204.
The network 214 includes a plurality of physical channels 306. The physical channels 306 are implemented by programming the NoC 106. Each physical channel 306 includes one or more NoC packet switches 206 and associated routing 208. An NMU 202 connects with an NSU 204 through at least one physical channel 306. A physical channel 306 can also have one or more virtual channels 308.
Connections through the network 214 use a master-slave arrangement. In an example, the most basic connection over the network 214 includes a single master connected to a single slave. However, in other examples, more complex structures can be implemented.
In the example, the PS 104 includes a plurality of NMUs 202 coupled to the HNoC 404. The VNoC 402 includes both NMUs 202 and NSUs 204, which are disposed in the PL regions 102. The memory interfaces 406 include NSUs 204 coupled to the HNoC 404. Both the HNoC 404 and the VNoC 402 include NPSs 206 connected by routing 208. In the VNoC 402, the routing 208 extends vertically. In the HNoC 404, the routing extends horizontally. In each VNoC 402, each NMU 202 is coupled to an NPS 206. Likewise, each NSU 204 is coupled to an NPS 206. NPSs 206 are coupled to each other to form a matrix of switches. Some NPSs 206 in each VNoC 402 are coupled to other NPSs 206 in the HNoC 404.
Although only a single HNoC 404 is shown, in other examples, the NoC 106 can include more than one HNoC 404. In addition, while two VNoCs 402 are shown, the NoC 106 can include more than two VNoCs 402. Although memory interfaces 406 are shown by way of example, it is to be understood that other hardened circuits can be used in place of, or in addition to, the memory interfaces 406.
Each ECC proxy bridge 510, 512 is a hardened circuit in the SoC 100 (e.g., not a circuit configured in the PL 102). Implementing ECC proxy bridges 510, 512 as hardened circuits improves latency in the multi-master system 500 and conserves programmable logic to use for other functions.
In the example, the master circuit 502 accesses the memory 516 through the ECC proxy bridge 510. The master circuit 504 and the master circuit 506 access the memory 516 through the ECC proxy bridge 512. Each master circuit 502-506 has the same view of the memory 516. Each ECC proxy bridge 510, 512 provides an ECC proxy that allows a region of the memory 516 to be defined where (1) transactions consume twice the memory space and (2) transactions consume twice as much bandwidth going to the memory 516, but where (3) the address and data of each transaction have ECC protection. While two ECC proxy bridges 510, 512 are shown in the example, in general the multi-master system 500 can include one or more ECC proxy bridges. An ECC proxy provides a single address-match region for ECC proxy data. This region is configured during boot time and is configured identically across each ECC proxy bridge 510, 512. Transactions that do not match the defined ECC proxy region pass through the ECC proxy bridges 510, 512 unmodified, without further processing.
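The address-match behavior described above can be sketched as a simple range check. The base address and size below are hypothetical values chosen for illustration; the description does not specify them.

```python
# Sketch of the single address-match region check applied by each ECC
# proxy bridge. PROXY_BASE and PROXY_SIZE are illustrative assumptions;
# they would be configured identically in every bridge at boot time.

PROXY_BASE = 0x4000_0000          # hypothetical region start
PROXY_SIZE = 0x2000_0000          # hypothetical 0.5 GB master-side window

def route(txn_addr):
    """Return 'proxy' if the bridge must ECC-process the transaction,
    or 'passthrough' if it forwards the transaction unmodified."""
    if PROXY_BASE <= txn_addr < PROXY_BASE + PROXY_SIZE:
        return "proxy"
    return "passthrough"
```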
From the point-of-view of the master circuits 502-506, the ECC proxy region appears as a linear contiguous memory region that is 0.5 GB in size (e.g., in the layout 600). At the memory 516, the ECC proxy region requires 1 GB of address space. The ECC proxy bridges 510, 512 handle (1) identifying transactions that target the ECC proxy region and (2) inserting and interleaving ECC bytes into the data. The ECC proxy bridges 510, 512 perform the reverse process on the data that is read back from the ECC proxy region of the memory 516. An example technique for interleaving ECC bytes with data is described below.
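The doubled footprint can be sketched as an offset mapping from the linear master-side view to the backing region. The one-ECC-byte-per-data-byte interleave below is an assumption for illustration; the description states only that ECC bytes are inserted and interleaved with the data.

```python
# Sketch of why the 0.5 GB master-side window consumes 1 GB of address
# space at the memory: pairing each data byte with one ECC byte doubles
# the footprint. The exact interleave pattern is an assumption here.

def to_memory_offset(master_offset):
    """Map a byte offset in the linear 0.5 GB view to its data slot in
    the 1 GB backing region (even slots hold data)."""
    return 2 * master_offset

def ecc_slot(master_offset):
    """Slot of the ECC byte paired with the data byte (odd slots)."""
    return 2 * master_offset + 1
```

Under this assumed layout, the last master-side byte (offset 0x1FFF_FFFF) and its ECC byte land exactly at the top of the 1 GB (0x4000_0000-byte) backing region.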
The ECC proxy bridge 510 or 512 can calculate the ECC information beginning at step 810, where the ECC proxy bridge 510 or 512 determines an address of a byte. At step 812, the ECC proxy bridge 510 or 512 pads any undefined bits of the address with zero. At step 814, the ECC proxy bridge 510 or 512 concatenates the address of the byte and the data of the byte. At step 816, the ECC proxy bridge 510 or 512 calculates ECC for the concatenated address and data to create codeword bits. At step 818, the ECC proxy bridge 510 or 512 inserts the codeword bits in an outgoing ECC field for the byte. The ECC proxy bridge 510 or 512 performs steps 810 through 818 for each byte in the write transaction. At step 820, the ECC proxy bridge 510 or 512 sends the modified write transaction to the memory 516.
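Steps 810 through 820 can be sketched as follows. The address width (ADDR_BITS) and the XOR-fold standing in for the ECC calculation are assumptions for illustration; the description does not fix the code used by the hardened bridge.

```python
# Sketch of the write path (steps 810-820): for each byte, pad the
# address, concatenate it with the data, compute check bits over the
# pair, and place them in the outgoing ECC field. The XOR-fold "ECC"
# is a toy stand-in for the bridge's actual code.

ADDR_BITS = 48                     # assumed address width after zero-padding

def check_bits(addr, data_byte):
    """Toy ECC: XOR-fold the zero-padded address concatenated with the
    data byte (steps 812-816)."""
    word = ((addr & ((1 << ADDR_BITS) - 1)) << 8) | data_byte
    ecc = 0
    while word:
        ecc ^= word & 0xFF         # fold the concatenated word byte-wise
        word >>= 8
    return ecc

def protect_write(base_addr, payload):
    """Steps 810-820: emit (data, ecc) pairs for every byte of the write
    transaction before forwarding it to the memory."""
    out = []
    for i, b in enumerate(payload):
        out.append((b, check_bits(base_addr + i, b)))   # step 818
    return out
```

Because the byte's address participates in the check bits, a read that returns data from the wrong location produces a mismatch even if the data itself is intact.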
At step 908, the ECC proxy bridge 510 or 512 stores the address of the read transaction. At step 910, the ECC proxy bridge 510 or 512 receives the data from the memory 516 for the read transaction. At step 912, the ECC proxy bridge 510 or 512 concatenates the stored address and the returned data. At step 914, the ECC proxy bridge 510 or 512 performs ECC checking of the concatenated address and data.
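Steps 908 through 914 can be sketched as a recompute-and-compare check over the stored address and the returned data. The XOR-fold below stands in for the bridge's actual ECC calculation, which the description does not specify; a real SECDED code would also locate and correct single-bit errors rather than merely flag mismatches.

```python
# Sketch of the read path (steps 908-914): the bridge stores the read
# address, receives the (data, ecc) byte pairs from memory, recomputes
# the check bits over address-plus-data, and compares. The XOR-fold is
# a toy stand-in for the bridge's actual ECC code.

ADDR_BITS = 48                     # assumed address width after zero-padding

def check_bits(addr, data_byte):
    """Toy ECC over the zero-padded address concatenated with the byte."""
    word = ((addr & ((1 << ADDR_BITS) - 1)) << 8) | data_byte
    ecc = 0
    while word:
        ecc ^= word & 0xFF
        word >>= 8
    return ecc

def check_read(stored_addr, returned_pairs):
    """Steps 908-914: verify each returned (data, ecc) pair against the
    stored address; return the addresses of any mismatching bytes."""
    errors = []
    for i, (data, ecc) in enumerate(returned_pairs):
        if check_bits(stored_addr + i, data) != ecc:    # step 914
            errors.append(stored_addr + i)
    return errors
```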
Referring to the PS 2, each of the processing units includes one or more central processing units (CPUs) and associated circuits, such as memories, interrupt controllers, direct memory access (DMA) controllers, memory management units (MMUs), floating point units (FPUs), and the like. The interconnect 16 includes various switches, busses, communication links, and the like configured to interconnect the processing units, as well as interconnect the other components in the PS 2 to the processing units.
The OCM 14 includes one or more RAM modules, which can be distributed throughout the PS 2. For example, the OCM 14 can include battery backed RAM (BBRAM), tightly coupled memory (TCM), and the like. The memory controller 10 can include a DRAM interface for accessing external DRAM. The peripherals 8, 15 can include one or more components that provide an interface to the PS 2. For example, the peripherals 8, 15 can include a graphics processing unit (GPU), a display interface (e.g., DisplayPort, high-definition multimedia interface (HDMI) port, etc.), universal serial bus (USB) ports, Ethernet ports, universal asynchronous receiver-transmitter (UART) ports, serial peripheral interface (SPI) ports, general-purpose I/O (GPIO) ports, serial advanced technology attachment (SATA) ports, PCIe ports, and the like. The peripherals 15 can be coupled to the MIO 13. The peripherals 8 can be coupled to the transceivers 7. The transceivers 7 can include serializer/deserializer (SERDES) circuits, MGTs, and the like.
In some FPGAs, each programmable tile can include at least one programmable interconnect element (“INT”) 43 having connections to input and output terminals 48 of a programmable logic element within the same tile, as shown by examples included at the top of
In an example implementation, a CLB 33 can include a configurable logic element (“CLE”) 44 that can be programmed to implement user logic plus a single programmable interconnect element (“INT”) 43. A BRAM 34 can include a BRAM logic element (“BRL”) 45 in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured example, a BRAM tile has the same height as five CLBs, but other numbers (e.g., four) can also be used. A DSP tile 35 can include a DSP logic element (“DSPL”) 46 in addition to an appropriate number of programmable interconnect elements. An IOB 36 can include, for example, two instances of an input/output logic element (“IOL”) 47 in addition to one instance of the programmable interconnect element 43. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 47 typically are not confined to the area of the input/output logic element 47.
In the pictured example, a horizontal area near the center of the die (shown in
Some FPGAs utilizing the architecture illustrated in
Note that
While the foregoing is directed to specific examples, other and further examples may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.