The technology of the disclosure relates generally to Peripheral Component Interconnect (PCI) express (PCIe) systems.
Mobile communication devices have become increasingly common in modern society. The increasing popularity of such mobile communication devices is driven, in part, by the increased functionality available on these devices. Such increased functionality is enabled by the inclusion of ever more complex integrated circuits (ICs) within the mobile communication devices. As the number and complexity of the ICs within the mobile communication devices have increased, so has the need for the various ICs to communicate with one another.
Several standards have been published outlining various protocols that allow ICs to communicate with one another. A popular protocol is the Peripheral Component Interconnect (PCI) protocol, which comes in various flavors, including the PCI express (PCIe) protocol. While useful as IC-to-IC communication protocols, the PCI and PCIe protocols may also be used to couple a mobile terminal to a remote device through a cable or other connector.
The PCIe protocol is frequently used to control access to memory elements. In many instances, more than one PCIe component may want to access the memory elements concurrently. In such instances, the access requests are sent to a system memory (or device memory) to read or write data. However, PCIe is defined as not coherent. That is, modifications to the system memory (or the device memory) are not automatically communicated to other PCIe components. As a result, it may be difficult to manage and control concurrent access to the memory elements correctly. Thus, a better mechanism is needed for managing such concurrent use of memory resources.
Aspects disclosed in the detailed description include coherency driven enhancements to a Peripheral Component Interconnect (PCI) express (PCIe) transaction layer. In an exemplary aspect, a coherency agent is added to a PCIe system to support a relaxed consistency model for use of memory in the PCIe system. The PCIe system may include a system memory element with data stored therein. Rather than require endpoints of the PCIe system to read from and write to the system memory element, exemplary aspects of the present disclosure allow the endpoints to request ownership of portions of the system memory element. Such portions may be defined by an address range of the system memory element. The coherency agent assigns a requested address range to a requesting endpoint. This assignment may sometimes be referred to as assigning ownership. The requesting endpoint copies contents of the system memory element corresponding to the assigned address range into local endpoint memory. The requesting endpoint then performs local read and write operations on the copied memory contents. The owning endpoint may send an updated snapshot of the copied memory contents (as updated by any local write operations) if requested by a root complex or other endpoint. At completion of use of the copied memory contents by the endpoint or after instruction from the coherency agent of the root complex, the ownership of the address range reverts back to the root complex, and the endpoint sends updated contents back to the address range in the system memory element.
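The snapshot behavior described above, in which an owning endpoint reports its current copy without giving up ownership, can be sketched in Python as follows. This is a minimal illustration only: the class `OwningEndpoint`, its methods, and the use of letter labels as addresses are hypothetical and are not taken from the PCIe specification or from the disclosure.

```python
class OwningEndpoint:
    """Toy endpoint that owns an address range and works on a local copy.

    Hypothetical model: names and structure are assumptions for illustration.
    """

    def __init__(self, name, assigned_range, initial_data):
        self.name = name
        self.assigned_range = tuple(assigned_range)
        # Local endpoint memory holding the copied contents.
        self.local = dict(initial_data)

    def write_local(self, addr, value):
        # Local writes are only valid inside the assigned address range.
        if addr not in self.assigned_range:
            raise KeyError(f"{addr!r} is outside the assigned range")
        self.local[addr] = value

    def snapshot(self):
        # Return a point-in-time copy; ownership stays with the endpoint.
        return dict(self.local)


ep = OwningEndpoint("ep1", ("H", "I"), {"H": 1, "I": 2})
ep.write_local("H", 10)
snap = ep.snapshot()      # requested by a root complex or another endpoint
ep.write_local("I", 20)   # later local writes do not alter the snapshot
```

In this sketch the snapshot reflects local writes made before it was taken, while the endpoint continues to own and modify the range afterward.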
In this regard, in one aspect, a method for controlling a host memory in a PCIe system is provided. The method includes receiving, at a root complex of a host associated with a host memory in the host, a request from a first endpoint for access to a first portion of data stored in the host memory. The method further includes requesting, to a coherency agent of the host, an ownership of an address range associated with the first portion of the data from the host. The method further includes assigning, by the coherency agent, the ownership of the address range from the host to the first endpoint and providing data associated with the address range to the first endpoint. The method further includes receiving, from the first endpoint, modified data associated with the address range when the ownership of the address range returns to the host.
In another aspect, a host system of a PCIe system is provided. The host system includes a PCIe bus interface configured to be coupled to at least a first endpoint and a second endpoint through a PCIe bus. The host system further includes a host memory comprising data stored therein, at least a first portion of the data and a second portion of the data associated with an address range. The host system further includes a root complex associated with the host memory, configured to receive a request for ownership of the first portion of the data associated with the address range from the first endpoint from the PCIe bus. The host system further includes a coherency agent configured to control ownership of the address range.
In another aspect, a method for managing data in an endpoint of a PCIe system is provided. The method includes requesting, by a first endpoint to a root complex associated with a host memory, access to a portion of data stored in the host memory. The method further includes receiving, from the root complex, data associated with an address range and ownership of the address range. The method further includes storing, at a local memory of the first endpoint, the data associated with the address range. The method further includes providing, to the root complex, modified data associated with the address range in response to the ownership of the address range returning to a host system.
In another aspect, an endpoint of a PCIe system is provided. The endpoint includes a local memory. The endpoint also includes processing circuitry coupled to the local memory. The processing circuitry of the endpoint is configured to request, to a root complex associated with a host memory of a PCIe system, access to a portion of data stored in the host memory. The processing circuitry of the endpoint is further configured to receive, from the root complex, data associated with an address range and ownership of the address range. The processing circuitry of the endpoint is further configured to store, at the local memory of the endpoint, the data associated with the address range. The processing circuitry of the endpoint is further configured to provide, to the root complex, modified data associated with the address range in response to the ownership of the address range returning to the PCIe system.
In another aspect, a host system of a PCIe system is provided. The host system includes a means for interfacing with at least a first endpoint and a second endpoint through a PCIe bus. The host system further includes a means for storing data, at least a first portion of the data and a second portion of the data associated with an address range. The host system further includes a means for processing data ownership requests for the data stored in the means for storing data, configured to receive a request for ownership of the first portion of the data associated with the address range from the first endpoint from the PCIe bus. The host system further includes a means for controlling memory configured to control ownership of the address range.
In another aspect, a PCIe system is provided. The PCIe system includes a host system, including a PCIe bus interface configured to be coupled to at least an endpoint of a PCIe system through a PCIe bus. The host system further includes a host memory including data stored therein, at least a portion of the data associated with an address range. The host system further includes a root complex associated with the host memory, configured to receive a request for ownership of the portion of the data associated with the address range from the endpoint from the PCIe bus. The host system further includes a coherency agent configured to control ownership of the address range.
The PCIe system further includes the endpoint, including a local memory and processing circuitry configured to request, to the root complex, access to the portion of the data stored in the host memory. The processing circuitry is further configured to receive, from the root complex, the data associated with the address range and the ownership of the address range. The processing circuitry is further configured to store, at the local memory, the data associated with the address range. The processing circuitry is further configured to provide, to the root complex, modified data associated with the address range in response to the ownership of the address range returning to the host system.
With reference now to the drawing figures, several exemplary aspects of the present disclosure are described. The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects.
Before discussing exemplary aspects of coherency driven enhancements to a PCIe transaction layer, a brief overview of a conventional PCIe system is first provided in
In this regard,
With continued reference to
According to the PCIe protocol, TLPs are used to communicate transactions, such as read and write, as well as certain types of events, between the PCIe RC 120, the plurality of PCIe endpoints 104(1)-104(M), and the PCIe switch 108. The PCIe protocol defines four (4) types of transactions: memory transactions, input/output (I/O) transactions, configuration transactions, and message transactions. The memory transactions include Read Request, Write Request, and AtomicOp request transactions. For the memory transactions, PCIe is defined as not coherent. That is, modifications to the memory 118 are not automatically communicated to other PCIe components such as the plurality of PCIe endpoints 104(1)-104(M). Thus, it may be difficult to manage and control access to the memory 118 from PCIe components such as the plurality of PCIe endpoints 104(1)-104(M).
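The consequence of this lack of coherency can be illustrated with a short Python sketch. The enumerations simply list the transaction categories named above; the `Endpoint` class, its cache, and the dictionary standing in for the memory 118 are hypothetical modeling choices and do not represent real TLP encoding or any actual PCIe API.

```python
from enum import Enum, auto


class TransactionType(Enum):
    """The four PCIe transaction categories named in the text."""
    MEMORY = auto()
    IO = auto()
    CONFIGURATION = auto()
    MESSAGE = auto()


class MemoryRequest(Enum):
    """Memory transaction request kinds named in the text."""
    READ = auto()
    WRITE = auto()
    ATOMIC_OP = auto()


class Endpoint:
    """Toy endpoint that remembers values it has read (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.cached = {}

    def read(self, memory, addr):
        self.cached[addr] = memory[addr]
        return self.cached[addr]

    def write(self, memory, addr, value):
        memory[addr] = value
        self.cached[addr] = value


memory = {0x10: "H"}
ep1 = Endpoint("ep1")
ep2 = Endpoint("ep2")

ep1.read(memory, 0x10)          # ep1 now holds "H"
ep2.write(memory, 0x10, "H'")   # ep2 modifies memory; ep1 is not notified
stale = ep1.cached[0x10]        # ep1 still sees the old value
```

Nothing in this model informs `ep1` of the write by `ep2`, so `ep1` keeps working with an outdated value until it reads the memory again.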
In this regard,
The host system 202 includes several elements similar to those described above with respect to the conventional host system 102 illustrated in
In exemplary aspects of the present disclosure, the relaxed consistency model is implemented when an endpoint, such as one of the plurality of PCIe endpoints 218(1)-218(M), desires to read from and write to a portion of the host memory 214. Absent exemplary aspects of the present disclosure, each time the endpoint wants to read from or write to the host memory 214, corresponding messages must cross the PCIe bus 220. Exemplary aspects of the present disclosure eliminate these per-access messages, and thus reduce message traffic on the PCIe bus 220, by using the coherency agent 204 to assign ownership of the address range in which the desired portion of the contents of the host memory 214 is stored to a requesting endpoint (for the sake of example, PCIe endpoint 218(1)). Such reduction in the message traffic on the PCIe bus 220 may reduce latency in general on the PCIe bus 220 since more bandwidth is available for other messages. Once the ownership is assigned in this fashion, the PCIe endpoint 218(1) copies the data stored in the address range, and therefore the desired portion of the host memory 214, to a local memory 222(1). The PCIe endpoint 218(1) may access the desired portion of the data in the address range faster by accessing the local memory 222(1) rather than having to communicate through the PCIe bus 220 to access the desired portion of the host memory 214 in the address range. The PCIe endpoint 218(1) may then perform read and write operations on the copied data until either the PCIe endpoint 218(1) completes its need for the copied data or the PCIe RC 206 requests the ownership back from the PCIe endpoint 218(1).
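The traffic reduction described above can be sketched in Python under simplifying assumptions. The classes `CoherencyAgent`, `RootComplex`, and `Endpoint`, the `bus_messages` counter, and the choice to count one request and one reply per ownership transfer are all hypothetical; the sketch only shows that, once ownership is assigned, reads and writes touch the local copy and add no bus messages.

```python
HOST = "host"


class CoherencyAgent:
    """Records the current owner of each address range (hypothetical model)."""

    def __init__(self):
        self.owners = {}

    def assign(self, addr_range, requester):
        # Only a range currently owned by the host can be handed out.
        if self.owners.get(addr_range, HOST) != HOST:
            raise RuntimeError("address range is already owned by an endpoint")
        self.owners[addr_range] = requester


class RootComplex:
    """Forwards ownership requests to the agent and returns the data."""

    def __init__(self, host_memory, agent):
        self.host_memory = host_memory
        self.agent = agent
        self.bus_messages = 0  # messages that crossed the modeled PCIe bus

    def request_ownership(self, requester, addr_range):
        self.bus_messages += 1  # ownership request from the endpoint
        self.agent.assign(addr_range, requester)
        self.bus_messages += 1  # reply carrying the data in the range
        return {addr: self.host_memory[addr] for addr in addr_range}


class Endpoint:
    """Endpoint with a local memory that holds the copied range."""

    def __init__(self, name):
        self.name = name
        self.local_memory = {}

    def acquire(self, root_complex, addr_range):
        self.local_memory = root_complex.request_ownership(self.name, addr_range)

    def read(self, addr):
        return self.local_memory[addr]   # served locally, no bus traffic

    def write(self, addr, value):
        self.local_memory[addr] = value  # applied locally, no bus traffic


host_memory = {"H": 1, "I": 2, "J": 3, "K": 4}
rc = RootComplex(host_memory, CoherencyAgent())
ep1 = Endpoint("218(1)")
ep1.acquire(rc, ("H", "I", "J", "K"))

# One hundred read-modify-write cycles on the local copy.
for _ in range(100):
    ep1.write("H", ep1.read("H") + 1)
```

In this model the hundred local updates leave the bus message count at the two messages used for the ownership transfer, and the host memory keeps its original contents until the data is written back.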
In this regard, the PCIe endpoint 218(1) may request ownership of the address range 214(H)-214(K) through signal 402 (
With continued reference to
With continued reference to
At another time, the PCIe endpoint 218(N) may need to write to the address range 214(H)-214(K). A write request (signal 424) is sent to the PCIe RC 206. The PCIe RC 206 responds with a query to the coherency agent 204 of the status of the address range 214(H)-214(K) through signal 426. The coherency agent 204 responds by informing the PCIe RC 206 to return the ownership of the address range 214(H)-214(K) to the PCIe RC 206 (signal 428). The PCIe RC 206 then commands the PCIe endpoint 218(1) to return the ownership of the address range 214(H)-214(K) to the PCIe RC 206 (signal 430). The PCIe endpoint 218(1) then writes the data H′-K′ to the host memory 214 (signal 432) to update the host memory 214 with the changes made in the address range 214(H)-214(K) by the PCIe endpoint 218(1). Note that while the example assumes all data H-K is rewritten as H′-K′, the present disclosure is not so limited. For example, the PCIe endpoint 218(1) could return H′, I, J′, and K′, H, I′-K′, or any other combination of old and new values instead of H′-K′ depending on the changes actually made at the PCIe endpoint 218(1).
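The reclaim sequence above (signals 424 through 432) can be traced with a small Python sketch. The `Host` class collapses the PCIe RC 206 and the coherency agent 204 into one object for brevity, and every name in the code is hypothetical. How the pending write from the second endpoint is handled after ownership returns is not modeled here.

```python
HOST = "host"


class Endpoint:
    """Endpoint holding a local copy of an owned range (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.local = {}

    def return_ownership(self):
        # Hand back the whole range: a mix of modified and unmodified values.
        data, self.local = self.local, {}
        return data


class Host:
    """Root complex plus coherency agent state, collapsed into one toy class."""

    def __init__(self, memory):
        self.memory = memory
        self.owners = {}   # address range -> owning Endpoint (absent: host)
        self.trace = []    # modeled signals, numbered as in the text

    def grant(self, addr_range, endpoint):
        self.owners[addr_range] = endpoint
        endpoint.local = {addr: self.memory[addr] for addr in addr_range}

    def write_request(self, addr_range):
        self.trace.append(424)                 # write request reaches the RC
        self.trace.append(426)                 # RC queries the agent
        holder = self.owners.get(addr_range, HOST)
        if holder != HOST:
            self.trace.append(428)             # agent: reclaim ownership
            self.trace.append(430)             # RC commands the current owner
            self.memory.update(holder.return_ownership())
            self.trace.append(432)             # owner has written data back
            self.owners[addr_range] = HOST


addr_range = ("H", "I", "J", "K")
host = Host({"H": "H", "I": "I", "J": "J", "K": "K"})
ep1 = Endpoint("218(1)")
host.grant(addr_range, ep1)

ep1.local.update({"H": "H'", "J": "J'", "K": "K'"})  # I is left unchanged
host.write_request(addr_range)                        # from another endpoint
```

After the write request, the host memory in this sketch holds H′, I, J′, and K′, which corresponds to one of the mixed old-and-new combinations mentioned above.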
Thus, as illustrated in state 308 of
It is also possible, although not illustrated, that the PCIe endpoint 218(1) returns the ownership of the address range 214(H)-214(K) when the PCIe endpoint 218(1) completes the task for which the ownership of the address range 214(H)-214(K) was transferred. In such an instance, the data H′-K′ may be copied back to the host memory 214 as previously described.
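A voluntary return of ownership on task completion maps naturally onto a Python context manager, sketched below. The function `owned_range`, the `owners` dictionary standing in for the coherency agent state, and the decision to write back even if the task raises an exception are assumptions made for illustration and are not described in the disclosure.

```python
from contextlib import contextmanager

HOST = "host"


@contextmanager
def owned_range(host_memory, owners, addr_range, endpoint_name):
    """Lend an address range to an endpoint for the duration of a task.

    Hypothetical helper: `owners` maps an address range to its owner and
    plays the role of the coherency agent's bookkeeping.
    """
    if owners.get(addr_range, HOST) != HOST:
        raise RuntimeError("address range is already owned by an endpoint")
    owners[addr_range] = endpoint_name
    # Copy the range into the endpoint's local memory.
    local = {addr: host_memory[addr] for addr in addr_range}
    try:
        yield local
    finally:
        # Task finished (or failed): write back and return ownership.
        host_memory.update(local)
        owners[addr_range] = HOST


host_memory = {"H": 1, "I": 2, "J": 3, "K": 4}
owners = {}
addr_range = ("H", "I", "J", "K")

with owned_range(host_memory, owners, addr_range, "218(1)") as local:
    local["H"] += 10
    owner_during = owners[addr_range]   # endpoint owns the range here
    host_during = host_memory["H"]      # host copy is not yet updated
```

When the `with` block exits, the modified data is copied back to the host memory and the range is marked as owned by the host again.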
The coherency driven enhancements to a PCIe transaction layer according to aspects disclosed herein may be provided in or integrated into any processor-based device. Examples, without limitation, include a set top box, an entertainment unit, a navigation device, a communications device, a fixed location data unit, a mobile location data unit, a mobile phone, a cellular phone, a smart phone, a tablet, a computer, a portable computer, a desktop computer, a personal digital assistant (PDA), a monitor, a computer monitor, a television, a tuner, a radio, a satellite radio, a music player, a digital music player, a portable music player, a digital video player, a video player, a digital video disc (DVD) player, an automobile, and a portable digital video player.
In this regard,
Other devices can be connected to the system bus 704. As illustrated in
While not illustrated in
Those of skill in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithms described in connection with the aspects disclosed herein may be implemented as electronic hardware, instructions stored in memory or in another computer-readable medium and executed by a processor or other processing device, or combinations of both. The devices described herein may be employed in any circuit, hardware component, integrated circuit (IC), or IC chip, as examples. Memory disclosed herein may be any type and size of memory and may be configured to store any type of information desired. To clearly illustrate this interchangeability, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. How such functionality is implemented depends upon the particular application, design choices, and/or design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The various illustrative logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented or performed with a processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The aspects disclosed herein may be embodied in hardware and in instructions that are stored in hardware, and may reside, for example, in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of computer readable medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a remote station. In the alternative, the processor and the storage medium may reside as discrete components in a remote station, base station, or server.
It is also noted that the operational steps described in any of the exemplary aspects herein are described to provide examples and discussion. The operations described may be performed in numerous different sequences other than the illustrated sequences. Furthermore, operations described in a single operational step may actually be performed in a number of different steps. Additionally, one or more operational steps discussed in the exemplary aspects may be combined. It is to be understood that the operational steps illustrated in the flowchart diagrams may be subject to numerous different modifications as will be readily apparent to one of skill in the art. Those of skill in the art will also understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The present application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Patent Application Ser. No. 62/182,815 filed on Jun. 22, 2015 and entitled “COHERENCY DRIVEN ENHANCEMENTS TO A PERIPHERAL COMPONENT INTERCONNECT (PCI) EXPRESS (PCIe) TRANSACTION LAYER,” the contents of which are incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
62/182,815 | Jun. 2015 | US