Virtualization facilitates shared memory access among several different devices, whereby a memory interconnect interfaces with the devices using virtual addresses, which are translated to physical addresses of the memory. To enable virtualization, a memory management unit maintains an index of physical and virtual addresses. During a memory access operation, the memory management unit translates a virtual address to a physical address, and returns the physical address in order to access the memory. This translation can occur bi-directionally such that the virtual address is maintained for communications at the device, and the physical address is indicated in operations at the memory.
Example embodiments of the present disclosure include a circuit configured to manage and enforce order among multiple independent threads of requests to a memory. The circuit may include a device interface and a memory interface operated by a control circuit. The device interface may operate to receive, from a plurality of devices, a plurality of access requests to access a memory, and parse each of the access requests to retrieve a respective transaction identifier (TID). The circuit may update a plurality of ordered lists (also referred to as “linked lists”) having entries corresponding to the plurality of access requests, where each of the ordered lists corresponds to a distinct transaction identifier. The circuit may also maintain a top list, which is an ordered list including entries from each of the plurality of ordered lists. The control circuit, via the memory interface, may then forward the access requests to the memory in an order corresponding to the top list. The circuit may forward access requests having a common TID in the order given by the corresponding ordered list, while forwarding access requests having different TIDs independently of one another.
In further embodiments, a translation circuit may operate to translate a virtual address component of each of the access requests to a corresponding physical address of the memory. The translated physical address can then be written to the corresponding entry of the top list or of an ordered list. The circuit may populate the top list based on an indication of which of the access requests have been updated with a physical address. Alternatively, the top list may be populated independent of this indication, and the access requests instead forwarded based on this indication. The circuit may populate the top list with entries from each of the ordered lists according to a predetermined selection process, such as a round-robin selection. The circuit may further remove an entry from the top list upon forwarding the corresponding access request to the memory.
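By way of illustration only, the list handling described above can be modeled in software. The following C sketch assumes array-backed lists, a small fixed number of TIDs, and a round-robin selection; the names (populate_top_list, forward_and_remove, and the like) are hypothetical and do not correspond to any particular circuit element. It shows one entry being drawn from each non-empty per-TID list into the top list in round-robin order, and an entry being removed from the top list once its request has been forwarded.

    /* Hypothetical software model of the top-list handling described above. */
    #include <stdint.h>

    #define NUM_TIDS     8
    #define TOP_ENTRIES 16
    #define LIST_DEPTH  16

    struct entry {
        uint16_t tid;
        uint64_t virt_addr;
    };

    struct tid_fifo {                          /* one ordered list per TID  */
        struct entry q[LIST_DEPTH];
        int head, count;
    };

    static struct tid_fifo lists[NUM_TIDS];    /* per-TID ordered lists     */
    static struct entry    top[TOP_ENTRIES];   /* the single top list       */
    static int             top_count;
    static int             rr_next;            /* round-robin position      */

    /* Move at most one entry from each non-empty per-TID list into the
     * top list, visiting the TIDs in round-robin order. */
    static void populate_top_list(void)
    {
        for (int n = 0; n < NUM_TIDS && top_count < TOP_ENTRIES; n++) {
            struct tid_fifo *l = &lists[rr_next];
            rr_next = (rr_next + 1) % NUM_TIDS;
            if (l->count == 0)
                continue;
            top[top_count++] = l->q[l->head];  /* oldest request of that TID */
            l->head = (l->head + 1) % LIST_DEPTH;
            l->count--;
        }
    }

    /* Forward the oldest top-list entry to the memory and remove it. */
    static void forward_and_remove(void)
    {
        if (top_count == 0)
            return;
        /* ... issue top[0] toward the memory interface here ...            */
        for (int i = 1; i < top_count; i++)    /* remove entry 0            */
            top[i - 1] = top[i];
        top_count--;
    }

    int main(void)
    {
        lists[2].q[0] = (struct entry){ .tid = 2, .virt_addr = 0x100 };
        lists[2].count = 1;
        lists[5].q[0] = (struct entry){ .tid = 5, .virt_addr = 0x200 };
        lists[5].count = 1;

        populate_top_list();   /* draws one entry each from TID 2 and TID 5 */
        forward_and_remove();  /* forwards the first of them and drops it   */
        return 0;
    }

The same structure admits other predetermined selection processes; round-robin is used here only because it is the example named above.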
The foregoing will be apparent from the following more particular description of example embodiments of the disclosure, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present disclosure.
A description of example embodiments follows.
The data processor 100 facilitates operations between a number of devices and resources, and arbitrates access to shared resources among the devices. In particular, the processor cores 150 may include one or more data processor cores. In an example embodiment, the processor cores 150 may include a number (e.g., 48) of ARM® processor cores, such as the ARMv8 processor cores. The processor cores 150 are connected, via a coherent memory interconnect (CMI) 135, to provide shared access to a number of other devices and resources, including the level-2 cache (L2C) and controller 160 (also referred to herein as “L2C”). The L2C further connects to a memory controller 165 for performing memory access operations to an external memory, such as a double data rate synchronous dynamic random-access memory (DDR SDRAM) array. Such a memory (not shown) may alternatively be located on-chip with the data processor 100. The CMI 135 may also connect to a coherent processor interconnect (CPI) 155 for communication with off-chip devices, such as an additional data processor. An example of one such configuration is described below with reference to
The CMI 135 is further connected to an input/output bridge (IOBN) 110, which provides an interconnect between the processor cores 150, CPI 155 and L2C 160 and additional devices and resources. In particular, devices 145A-F connect to the IOBN 110 via input/output interconnects (IOI), IOI0 155A and IOI1 155B, which may be non-coherent buses (NCBs) including passive and/or arbitrated channels. The devices 145A-F may include a number of different on-chip devices, such as co-processors, and may include I/O interfaces (e.g., USB, SATA, PCIe, Ethernet) to connect to a number of external or off-chip devices and interfaces. In order to arbitrate IOBN 110 resources among the devices 145A-F, NCB arbiters 140A-B receive requests from the devices 145A-F and selectively grant IOBN resources to the devices 145A-F. Once granted, the devices 145A-F may communicate with the processor cores 150, perform a memory access operation to the L2C 160, or access other components of the data processor 100.
In order to facilitate shared memory access among several different devices (e.g., the processor cores 150 and devices 145A-F), the data processor 100 may employ virtualization, whereby a memory interconnect (e.g., CMI 135 and IOBN 110) interfaces with the devices using virtual addresses, which are translated to a physical address of the memory. To enable virtualization, a System Memory Management Unit (SMMU) 180 maintains an index of physical and virtual addresses. During a memory access operation where a virtual address is provided, the IOBN 110 forwards the virtual address to the SMMU 180, which returns a corresponding physical address for accessing the memory (e.g., the L2C 160 or an external memory via the L2C 160). The IOBN 110 may translate addresses bi-directionally such that the virtual address is maintained for communications at the device, and the physical address is indicated in operations at the memory. The SMMU 180 may be further configured to support multiple tiers of virtual addresses.
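For purposes of illustration only, the lookup performed by the SMMU 180 can be approximated by a search over an index of virtual-to-physical mappings, as in the following C sketch. The page size, the table format, and the name smmu_translate are assumptions made for the example; as noted above, the actual SMMU 180 may support multiple tiers of virtual addresses, which this single-level sketch does not model.

    /* Hypothetical model of an SMMU-style lookup: an index of virtual-to-
     * physical page mappings is searched, and the physical address (or a
     * miss) is returned for a presented virtual address. */
    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define SMMU_ENTRIES 4
    #define PAGE_SHIFT   12                    /* assume 4 KB pages         */
    #define PAGE_MASK    ((1u << PAGE_SHIFT) - 1)

    struct smmu_entry {
        uint64_t vpage;                        /* virtual page number       */
        uint64_t ppage;                        /* physical page number      */
        bool     valid;
    };

    static struct smmu_entry smmu_index[SMMU_ENTRIES] = {
        { .vpage = 0x00010, .ppage = 0x3A000, .valid = true },
        { .vpage = 0x00011, .ppage = 0x3A001, .valid = true },
    };

    /* Translate a virtual address; returns true and fills *pa on a hit. */
    static bool smmu_translate(uint64_t va, uint64_t *pa)
    {
        uint64_t vpage = va >> PAGE_SHIFT;
        for (int i = 0; i < SMMU_ENTRIES; i++) {
            if (smmu_index[i].valid && smmu_index[i].vpage == vpage) {
                *pa = (smmu_index[i].ppage << PAGE_SHIFT) | (va & PAGE_MASK);
                return true;
            }
        }
        return false;                          /* no mapping: request waits */
    }

    int main(void)
    {
        uint64_t pa;
        if (smmu_translate(0x10abcULL, &pa))
            printf("VA 0x10abc -> PA 0x%llx\n", (unsigned long long)pa);
        return 0;
    }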
Control status registers (CSRs) 170 include registers for maintaining information about the instructions and operations of the data processor 100. The CSRs may maintain, for example, status information regarding a number of devices, as well as information about ongoing operations and instructions between devices and/or resources. Devices such as the processor cores 150 and the devices 145A-B, as well as other requestors 185 and off-chip devices (via the CPI 155), may write to and read from the CSRs 170 using register master logic (RML). To accommodate multiple requests from several different devices, a master RML (MRML) 120 operates to arbitrate access to the CSRs 170.
The data processors 205A-B may be connected to respective memory arrays (e.g., DDR SDRAM) 215A-B as shown, and/or may be connected to a common memory array. The data processors may be further connected to a number of external devices 245 via respective I/O interfaces (e.g., USB, SATA, PCIe, Ethernet).
Turning back to
An IOBN 110, in one embodiment, may be configured to control access to a memory by a number of devices and to maintain an order of access requests under virtualization. The IOBN 110 may manage and enforce order among multiple independent threads of requests to a memory. To do so, the IOBN 110 may populate a number of ordered lists with received access requests based on a corresponding identifier of each access request. The IOBN 110 may also maintain a top list, which is populated with access requests and their corresponding translated physical addresses. The IOBN 110 may then selectively forward access requests from the top list, maintaining the order of each of the independent threads.
An example IOBN 110 configured to provide the aforementioned functions is described below with reference to
The IOBN 110 includes a non-coherent bus (NCB) interface 355 for communicating with the devices 145A-F via intermediary NCBs, IOI0 155A and IOI1 155B. The IOBN 110 also includes a CMI interface 330 for communicating with the L2C 160 via the CMI 135. The IOBN 110 further includes a control circuit 320 and content addressable memory (CAM), including an IOBN input CAM (IIC) 340 and an IOBN request output (IXO) 350. Alternatively, the IIC 340 and IXO 350 may be located separately from the IOBN 110.
The devices 145A-F may forward memory access requests to the L2C 160 via the IOBN 110, for example to read from or write to the L2C 160. The IIC 340 stores a plurality of ordered lists (also referred to as “linked lists”) that maintain access requests of a common type in a specified order, such as the order in which the access requests were sent from a device. In one example, each device 145A-F may be assigned a set of one or more unique transaction IDs (TIDs). The IIC 340 maintains a separate, ordered list for each TID, and adds each received access request to the respective list based on its TID. Thus, each device 145A-F can maintain order among a particular thread of access requests by assigning those requests a common TID. Conversely, unrelated access requests that do not require a specific order (i.e., can be completed in any order) can be assigned different TIDs, enabling the requests to be sent independently of one another. Alternatively, if requests among two or more of the devices 145A-F must be sent to the L2C 160 in a given order, then the two or more devices 145A-F may be assigned one or more common TIDs. Example structures of ordered lists at the IIC are described below with reference to
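By way of illustration only, the per-TID ordered lists at the IIC 340 can be modeled as singly linked, first-in first-out lists, as in the following C sketch. The structure and function names (access_request, tid_list, iic_enqueue) are hypothetical. The sketch shows only that requests carrying a common TID are appended in arrival order, while requests carrying different TIDs reside on separate lists and therefore carry no ordering relationship to one another.

    /* Hypothetical model of per-TID ordered ("linked") lists. */
    #include <stdint.h>
    #include <stddef.h>

    struct access_request {
        uint16_t tid;                  /* transaction identifier            */
        uint64_t virt_addr;            /* virtual address from the device   */
        struct access_request *next;   /* next request having the same TID  */
    };

    struct tid_list {                  /* one ordered list per TID          */
        struct access_request *head;
        struct access_request *tail;
    };

    #define NUM_TIDS 256
    static struct tid_list iic[NUM_TIDS];

    /* Append a received request to the tail of its TID's ordered list,
     * preserving arrival order within that TID. */
    static void iic_enqueue(struct access_request *req)
    {
        struct tid_list *list = &iic[req->tid % NUM_TIDS];

        req->next = NULL;
        if (list->tail)
            list->tail->next = req;
        else
            list->head = req;
        list->tail = req;
    }

    int main(void)
    {
        struct access_request a = { .tid = 5, .virt_addr = 0x1000 };
        struct access_request b = { .tid = 5, .virt_addr = 0x2000 };
        struct access_request c = { .tid = 9, .virt_addr = 0x3000 };

        iic_enqueue(&a);   /* TID 5: a must be forwarded before b           */
        iic_enqueue(&b);
        iic_enqueue(&c);   /* TID 9: no ordering relationship to a or b     */
        return 0;
    }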
The control circuit 320 may operate to populate the IIC 340 with received access requests based on their respective transaction IDs as described above. Further, the control circuit 320 may forward the access requests to the SMMU 180 for virtual-to-physical address translation, and may selectively populate the IXO 350. The IXO 350 may maintain a single “top” list of access requests for forwarding to the L2C 160. An example structure of a top list maintained by the IXO 350 is described below with reference to
To select a next access request to forward to the L2C 160, the IOBN 110 may select (e.g., in a round-robin fashion) from among the entries in the IXO 350 that have a corresponding physical address. Thus, when a given access request is considered, if the IOBN 110 has received its corresponding physical address from the SMMU 180 (730), then the IOBN 110 forwards the access request to the L2C 160 (740) and clears the request from the top list at the IXO 350 (745). If the access request does not yet have a physical address when considered, the IOBN 110 may skip the request and reconsider it in a subsequent selection round. Alternatively, the operations of updating the top list (725) and checking for a physical address (730) may be reversed, such that the access request is added to the top list only upon receiving a physical address.
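For illustration only, the selection described above can be modeled as a round-robin scan of the top list in which an entry lacking its physical address is skipped and reconsidered in a later round, while an entry whose physical address has arrived is forwarded and then cleared. In the following C sketch the names and the software polling structure are assumptions for the example; the parenthetical step numbers in the comments refer to the operations discussed above.

    /* Hypothetical model of selecting a ready entry from the top list. */
    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define TOP_ENTRIES 8

    struct top_entry {
        bool     valid;       /* slot holds a pending access request        */
        bool     has_pa;      /* physical address received from the SMMU    */
        uint64_t phys_addr;
        uint16_t tid;
    };

    static struct top_entry top[TOP_ENTRIES];
    static int rr_next;       /* next slot to consider, round-robin         */

    /* Consider the slots in round-robin order; returns true if a request
     * was forwarded on this round. */
    static bool ixo_select_and_forward(void)
    {
        for (int n = 0; n < TOP_ENTRIES; n++) {
            struct top_entry *e = &top[rr_next];
            rr_next = (rr_next + 1) % TOP_ENTRIES;

            if (!e->valid)
                continue;
            if (!e->has_pa)                    /* (730) no PA yet: skip      */
                continue;

            printf("forward TID %u to PA 0x%llx\n",      /* (740) forward   */
                   (unsigned)e->tid, (unsigned long long)e->phys_addr);
            e->valid = false;                            /* (745) clear     */
            return true;
        }
        return false;                          /* nothing ready this round  */
    }

    int main(void)
    {
        top[0] = (struct top_entry){ .valid = true, .has_pa = false, .tid = 1 };
        top[1] = (struct top_entry){ .valid = true, .has_pa = true,
                                     .phys_addr = 0x3A010, .tid = 2 };
        ixo_select_and_forward();  /* skips slot 0, forwards slot 1          */
        return 0;
    }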
While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.