Embodiments relate to communication via multi-drop bus structures.
Many different types of known buses and other interfaces are used to connect different components using a wide variety of interconnection topologies. For example, on-chip buses are used to couple different on-chip components of a given integrated circuit (IC) such as a processor, system on a chip or so forth. External buses can be used to couple different components of a given computing system by way of interconnect traces on a circuit board such as a motherboard, wires and so forth.
A recent multi-drop interface technology is an Improved Inter Integrated Circuit (I3C) Specification-based bus, expected to become available from the Mobile Industry Processor Interface (MIPI) Alliance™ (www.mipi.org). This interface is expected to be used to connect devices, such as internal or external sensors or so forth, to a host processor, applications processor or standalone device via a host controller or input/output controller.
In various embodiments, techniques are provided to realize greater flexibility in various systems by allowing a wider variety of devices to be coupled via interconnects of potentially greater length. More specifically, embodiments provide techniques for performing link training to identify particular propagation delays inherent in communications between devices coupled to the interconnect. In this way, delay information obtained during such link training may be used to dynamically control clocking for bus communication operations, including read and write operations. As will be described further herein, in some cases a bus master may be configured to perform dynamic delay determination operations and to configure read and write communications appropriately based at least in part on such delay information.
Referring now to
As illustrated, a primary or main master device 20 couples to bus 15. In various embodiments, master device 20 may be implemented as a host controller that includes hardware logic to act as a bus master for bus 15. Master device 20 may include a controller (not shown in the high level view of
In different implementations, master device 20 may be an interface circuit of a multicore processor or other system on chip (SoC), application processor or so forth. In other cases, master device 20 may be a standalone host controller (such as a given integrated circuit (IC)) or main master device for bus 15. And of course other implementations are possible. In still other cases, master device 20 may be implemented as hardware, software, and/or firmware or combinations thereof, such as dedicated hardware logic, e.g., programmable logic, to perform bus master activities for bus 15.
Note that bus 15 is implemented as a two-wire bus in which a single serial line forms a data interconnect and another single serial line forms a clock interconnect. As such, data communications can occur, e.g., in a bidirectional manner and clock communication can occur in a single direction. Master device 20 may be a relatively computationally complex device (as compared to other devices on bus 15) that consumes higher power than other devices coupled to bus 15.
As shown in
As further illustrated in
During read operations on bus 15, a timing window available for performing a read is nearly half of the bus period (Tperiod) and can be defined as: Tbusavail = Tscomaster + Tpd15 + Tscoslave + Tskew-jitter + Tsetuptime <= D*Tperiod [Eq. 1], where Tscomaster is the bus master propagation delay time, which in an example may be approximately 3 nanoseconds (ns), Tscoslave is the slave device response time (e.g., a 10 ns maximum per a given specification, or more in some proprietary devices), Tpd15 is the signal round-trip path delay from the main master to the slave 30N or 40N and back to the main master, Tsetuptime is a setup time for the master (which in an example may be approximately 3 ns), D is a duty cycle factor which may range from 0.4 to 0.6 depending on the duty cycle requirement, and Tskew-jitter is a time allocated for skew and jitter (e.g., 3 ns). In an example without an embodiment, this leaves a turn-around time margin of approximately 13 ns, or approximately 6.5 ns for a signal to travel from master 20 to slave 40N or 30N. Due to impedance mismatches (and also the location of slave 40 relative to master 20), signal reflection may cause additional time loss, in turn leaving only, e.g., a 2 ns-4 ns time of flight margin from master 20 to slave 40N or 30N.
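The budget in Eq. 1 can be checked numerically. The sketch below is illustrative only: it assumes a 12.5 MHz bus clock (80 ns period) and a duty cycle factor D of 0.4, an operating point that reproduces the approximately 13 ns margin cited above. The function and parameter names are hypothetical.

```python
def read_turnaround_margin_ns(t_period_ns, duty, t_sco_master_ns=3.0,
                              t_sco_slave_ns=10.0, t_skew_jitter_ns=3.0,
                              t_setup_ns=3.0):
    """Return the time left for the round-trip path delay (Tpd15) under Eq. 1."""
    window_ns = duty * t_period_ns
    fixed_ns = t_sco_master_ns + t_sco_slave_ns + t_skew_jitter_ns + t_setup_ns
    return window_ns - fixed_ns

# Assumed operating point: 12.5 MHz bus clock (80 ns period), D = 0.4.
margin_ns = read_turnaround_margin_ns(t_period_ns=80.0, duty=0.4)
one_way_ns = margin_ns / 2
print(f"round trip: {margin_ns:.1f} ns, one way: {one_way_ns:.1f} ns")
```

Raising D toward 0.6 or lowering the bus frequency widens the window, which is the trade-off a fixed-timing design would otherwise have to make to reach distant devices.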
Because of these read window limitations, absent an embodiment, many system platform topologies impose a specification limit on long reach platform solutions. For example, a circuit board trace (FR4) may be limited to 15-20 inches and a standard cable length limited to 0.3 meters (m) to 0.5 m, depending on cable type, by a given system specification. Many types of computing systems, such as client, Internet of Things (IoT) and automotive applications, may have board traces longer than 20 inches and cable lengths of, e.g., 1 m to 5 m or more. Additionally, some proprietary slaves may have longer delays than specified in a given specification, which may also limit the choice of slave devices. Using an embodiment, a system designer is afforded the flexibility to use long reach solutions for board traces or cables (e.g., for automotive and IoT segments) without limiting the bus operating frequency.
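To see why longer reaches exhaust the read window, a rough time-of-flight estimate can be compared with the 2 ns-4 ns margin noted above. The per-unit delays below (about 0.17 ns per inch of FR4 trace and about 5 ns per meter of cable) are assumed typical figures, not values from any specification, and the helper name is hypothetical.

```python
# Assumed typical propagation delays; real values depend on stack-up and cable type.
FR4_NS_PER_INCH = 0.17
CABLE_NS_PER_M = 5.0

def one_way_flight_ns(trace_inches=0.0, cable_m=0.0):
    """Estimate one-way time of flight over a board trace plus a cable."""
    return trace_inches * FR4_NS_PER_INCH + cable_m * CABLE_NS_PER_M

MARGIN_NS = 4.0  # upper end of the 2 ns-4 ns one-way margin

for label, inches, meters in [("20 in trace", 20, 0.0),
                              ("0.5 m cable", 0, 0.5),
                              ("5 m cable", 0, 5.0)]:
    tof = one_way_flight_ns(inches, meters)
    verdict = "fits" if tof <= MARGIN_NS else "exceeds margin"
    print(f"{label}: {tof:.1f} ns ({verdict})")
```

Under these assumptions, the 20 inch trace and 0.5 m cable stay within the margin while a 5 m cable does not, which matches the limits described in the text.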
Embodiments provide techniques to optimize bus speed for bus 15. To this end, bus master 20 may perform, e.g., during boot or otherwise, link training with each device 30, 40 to determine propagation delay information of cable/trace and slave device responses via bus 15. This delay information may thereafter be used to program one or more of a read clock and a write clock with a programmable configuration. Such adjusted clocks may be used whenever bus master 20 performs transactions with each device 30, 40.
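One way to picture this flow is a per-device table that is filled in during training and consulted before each transaction. The sketch below is a software model only. The names `DelayTable`, `train` and `delay_for` are hypothetical, and a caller-supplied measurement function stands in for the hardware link training.

```python
from typing import Callable, Dict, Iterable

class DelayTable:
    """Software model of per-device propagation delay storage (hypothetical)."""

    def __init__(self) -> None:
        self._delay_ns: Dict[int, float] = {}

    def train(self, addresses: Iterable[int],
              measure_ns: Callable[[int], float]) -> None:
        # Run once (e.g., during boot) for every device on the bus.
        for addr in addresses:
            self._delay_ns[addr] = measure_ns(addr)

    def delay_for(self, addr: int) -> float:
        # Looked up whenever the master starts a transaction with `addr`.
        return self._delay_ns[addr]

# Stand-in measurements for a near device and a far device (made-up numbers).
fake_delays = {0x30: 4.0, 0x40: 22.0}
table = DelayTable()
table.train(fake_delays.keys(), lambda addr: fake_delays[addr])
print(table.delay_for(0x40))
```

Keeping one entry per device lets the master apply a short delay to a nearby device and a longer one to a distant device without retraining between transactions.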
Referring now to
At the high level illustrated in
Device 140B may be powered when it is to be active. As an example, assume that device 140B is another type of sensor, such as a camera device. In such example, device 140B may be powered on only when a camera functionality of the system is active. In other cases device 140B may be a slave device that can be physically added/removed via a hot plug or hot unplug operation, such as a cable, card, or external peripheral device that is coupled to bus 130, e.g., by a cable, external connection or so forth. In still other cases, device 140B may be coupled via an in-box cable. In such cases, there may be a long distance between device 140B and host controller 110. Note that device 140B may be relatively further away from host controller 110 than device 140A.
As illustrated in
Host controller 110 further includes a clock generator 115 to provide a clock signal (and/or to receive a clock signal, in implementations for certain buses) to a clock line of bus 130 via corresponding driver 116. In various embodiments, clock generator 115 may be configured to provide additional clock signals for use in host controller 110, as described herein.
To perform the link training described herein, host controller 110 includes a link training circuit 120. Link training circuit 120 is configured to cause a dedicated link training command to be sent via driver 113 to a respective device 140. This link training command, when received within a given device 140, may cause the corresponding device 140 to cooperate in a link training process by immediately sending an acknowledgement command signal (ACK) back to host controller 110. As illustrated, device 140A includes a slave link control circuit 145 that, responsive to this link training command, decodes the link training command and immediately sends an acknowledgement command via driver 144 over bus 130. In this way, link training circuit 120 of host controller 110 can readily identify total propagation delay information for communications between host controller 110 and device 140A, as described herein.
More specifically, link training circuit 120 sends this link training command via driver 113 along the data line of interconnect 130 to a receiver 146, which in turn provides the received link training command signal to slave link control circuit 145 for decoding and further response. In response to this signal, slave link control circuit 145 immediately sends the acknowledgement command signal via driver 144. As illustrated, this incoming signal is received within host controller 110 via a receiver 114 that provides this acknowledgement command signal back to link training circuit 120 for decoding the received ACK.
Based at least in part on the time duration from the sending of the link training command from link training circuit 120 to receipt of the acknowledgement command signal in link training circuit 120, appropriate propagation delay parameters for communications with respect to device 140A may be determined. Link training circuit 120 sends configuration information, namely configuration read information (e.g., a number of configuration bits), to read control circuit 122 to adjust internal read clock timing. Similarly, link training circuit 120 sends the same delay estimate as configuration write information to write control circuit 124, which can use this information to hold the next write data. In an embodiment, timing adjustment could be done using flip-flop based delay generation, or in another way. As will be described further herein, to enable link training circuit 120 to perform link training operations with respect to devices 140, clock generator 115 may provide a system clock signal to link training circuit 120; this system clock signal may run at a much faster rate than the SCL clock.
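As a rough illustration of how a measured duration might become configuration bits, the sketch below quantizes a round-trip delay into a number of delay steps and reuses that same value for both the read clock adjustment and the write hold, as described above. The step size, field width, and all names are assumptions for illustration; the actual encoding is not specified here.

```python
import math

def delay_to_config_bits(round_trip_ns: float, step_ns: float = 2.5,
                         width_bits: int = 4) -> int:
    """Quantize a delay into a saturating n-bit count of delay steps."""
    steps = math.ceil(round_trip_ns / step_ns)
    return min(steps, (1 << width_bits) - 1)

def configure(round_trip_ns: float) -> dict:
    bits = delay_to_config_bits(round_trip_ns)
    # The same delay estimate feeds both the read and write paths.
    return {"read_clock_delay": bits, "write_hold_delay": bits}

print(configure(12.0))
```

Rounding up rather than down is a deliberate choice in this sketch, so that the adjusted read clock never samples before the data has arrived.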
Referring now to
As illustrated, method 200 is a process for link training that initiates at block 210. First, a given device is selected (n=0) (block 215). Thereafter, the host sends a link training command to the given device coupled to the host via a bus (block 220). Note that this command in an embodiment is a particular link training command. In association with sending the link training command, the host controller may initiate an internal timer to measure delay using a system clock (block 230). Note that this system clock may run at a much higher rate than the SCL bus clock frequency. In this way, link training can account for small variations in propagation delay for different devices coupled to the bus. In a particular embodiment where a bus clock operates at 12.5 MHz, this high frequency clock may be many times higher than the bus clock rate. Of course other examples are possible.
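The benefit of timing with a faster clock can be seen by modeling the counter. In the sketch below, the 12.5 MHz bus clock comes from the text, while the 400 MHz system clock and the two device delays are assumed values chosen only to show the difference in resolution. The function name is hypothetical.

```python
import math

def count_ticks(delay_ns: float, clock_hz: float) -> int:
    """Number of whole clock periods elapsed before the ACK is observed."""
    period_ns = 1e9 / clock_hz
    return math.ceil(delay_ns / period_ns)

BUS_CLOCK_HZ = 12.5e6      # from the text: 80 ns period
SYSTEM_CLOCK_HZ = 400e6    # assumed: 2.5 ns period

for delay_ns in (18.0, 31.0):  # made-up round-trip delays for two devices
    coarse = count_ticks(delay_ns, BUS_CLOCK_HZ)
    fine = count_ticks(delay_ns, SYSTEM_CLOCK_HZ)
    print(f"{delay_ns} ns -> {coarse} bus tick(s), {fine} system ticks")
```

With these assumed numbers, both devices look identical when counted in bus clock periods, whereas the faster clock distinguishes them, which is what allows a distinct delay value per device.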
Still with reference to
Still with reference to
Still with reference to
Referring now to
Still with reference to
Referring now to
As further illustrated in
Embodiments may be implemented in a wide variety of interconnect structures. Referring to
System memory 610 includes any memory device, such as random access memory (RAM), non-volatile (NV) memory, or other memory accessible by devices in system 600. System memory 610 is coupled to controller hub 615 through memory interface 616. Examples of a memory interface include a double-data rate (DDR) memory interface, a dual-channel DDR memory interface, and a dynamic RAM (DRAM) memory interface.
In one embodiment, controller hub 615 is a root hub, root complex, or root controller in a PCIe interconnection hierarchy. Examples of controller hub 615 include a chip set, a memory controller hub (MCH), a northbridge, an interconnect controller hub (ICH), a southbridge, and a root controller/hub. Often the term chip set refers to two physically separate controller hubs, i.e. a memory controller hub (MCH) coupled to an interconnect controller hub (ICH). Note that current systems often include the MCH integrated with processor 605, while controller 615 is to communicate with I/O devices, in a similar manner as described below. In some embodiments, peer-to-peer routing is optionally supported through root complex 615.
Here, controller hub 615 is coupled to switch/bridge 620 through serial link 619. Input/output modules 617 and 621, which may also be referred to as interfaces/ports 617 and 621, include/implement a layered protocol stack to provide communication between controller hub 615 and switch 620. In one embodiment, multiple devices are capable of being coupled to switch 620.
Switch/bridge 620 routes packets/messages from device 625 upstream, i.e., up a hierarchy towards a root complex, to controller hub 615 and downstream, i.e., down a hierarchy away from a root controller, from processor 605 or system memory 610 to device 625. Switch 620, in one embodiment, is referred to as a logical assembly of multiple virtual PCI-to-PCI bridge devices. Device 625 includes any internal or external device or component to be coupled to an electronic system, such as an I/O device, a Network Interface Controller (NIC), an add-in card, an audio processor, a network processor, a hard-drive, a storage device, a CD/DVD ROM, a monitor, a printer, a mouse, a keyboard, a router, a portable storage device, a Firewire device, a Universal Serial Bus (USB) device, a scanner, and other input/output devices and which may be coupled via an I3C bus, as an example. Often in the PCIe vernacular, such a device is referred to as an endpoint. Although not specifically shown, device 625 may include a PCIe to PCI/PCI-X bridge to support legacy or other version PCI devices. Endpoint devices in PCIe are often classified as legacy, PCIe, or root complex integrated endpoints.
Graphics accelerator 630 is also coupled to controller hub 615 through serial link 632. In one embodiment, graphics accelerator 630 is coupled to an MCH, which is coupled to an ICH. Switch 620, and accordingly I/O device 625, is then coupled to the ICH. I/O modules 631 and 618 are also to implement a layered protocol stack to communicate between graphics accelerator 630 and controller hub 615. A graphics controller or the graphics accelerator 630 itself may be integrated in processor 605.
Turning next to
Interconnect 712 provides communication channels to the other components, such as a Subscriber Identity Module (SIM) 730 to interface with a SIM card, a boot ROM 735 to hold boot code for execution by cores 706 and 707 to initialize and boot SoC 700, a SDRAM controller 740 to interface with external memory (e.g., DRAM 760), a flash controller 745 to interface with non-volatile memory (e.g., flash 765), a peripheral controller 750 (e.g., an eSPI interface) to interface with peripherals, video codecs 720 and video interface 725 to display and receive input (e.g., touch enabled input), GPU 715 to perform graphics related computations, etc. Any of these interconnects/interfaces may incorporate aspects described herein, including delay determinations and clock control. In addition, the system illustrates peripherals for communication, such as a Bluetooth module 770, 3G modem 775, GPS 780, and WiFi 785. Also included in the system is a power controller 755.
Referring now to
Still referring to
Furthermore, chipset 890 includes an interface 892 to couple chipset 890 with a high performance graphics engine 838, by a P-P interconnect 839. As shown in
The following examples pertain to further embodiments.
In one example, an apparatus comprises a host controller to couple to an interconnect to which a plurality of devices may be coupled. The host controller may include: a first driver to drive first information onto the interconnect; a first receiver to receive second information from at least one of the plurality of devices via the interconnect; a read controller to adjust timing of a read clock based on a propagation delay timer value associated with a first device of the plurality of devices and communicate information on the interconnect with the first device using the adjusted read clock timing; and a link training controller to initiate a link training with the first device to determine the propagation delay timer value.
In an example, the host controller comprises a table having a plurality of entries each to store a propagation delay timer value for one of the plurality of devices.
In an example, the link training controller is to dynamically determine the propagation delay timer value for the first device.
In an example, the link training controller is to send a link training command to the first device to cause the first device to send an acknowledgement command to the host controller.
In an example, the link training controller is to determine the propagation delay timer value based on a time duration between a first time that the host controller is to send the link training command and a second time that the host controller is to receive the acknowledgement command.
In an example, the apparatus further comprises a timer to count the time duration according to a second clock, the second clock having a frequency substantially greater than the read clock.
In an example, the link training controller is to send a second propagation delay timer value associated with a second device of the plurality of devices, and the read controller is to adjust the timing of the read clock based on the second propagation delay timer value and communicate second information on the interconnect with the second device using the adjusted read clock timing, where the second propagation delay timer value is different than the propagation delay timer value.
In an example, the read controller comprises a first programmable circuit to adjust the read clock using the propagation delay timer value.
In an example, the host controller further comprises a write controller having a second programmable circuit to adjust a write clock to be provided to a write circuit of the host controller according to the propagation delay timer value, to maintain an inter-packet gap between a cycle of the write clock and a cycle of the read clock.
In an example, the link training controller is to dynamically determine a unique propagation delay timer value for each of the plurality of devices after the plurality of devices have been enumerated with address information.
In another example, a method comprises: identifying, via a host controller, a slave device having control of a bus that couples the slave device and the host controller; accessing a timer storage based on the identified slave device to obtain a timer value associated with the slave device; adjusting a timing of a read clock based on the timer value; and reading data received in the host controller from the slave device according to the adjusted timing of the read clock.
In an example, the method further comprises dynamically determining the timer value for the slave device based on a training.
In an example, the training comprises sending a first command from the host controller to the slave device to cause the slave device to send an acknowledgement command to the host controller.
In an example, the method further comprises determining the timer value based on a time duration between a first time at which the host controller sends the first command and a second time at which the host controller receives the acknowledgement command.
In an example, the method further comprises measuring the time duration according to a clock signal having a higher clock rate than the read clock.
In an example, the method further comprises communicating with a second slave device coupled to the host controller via the bus according to a second timer value, the second timer value different than the timer value, where the second slave device is located at a second distance with respect to the host controller, the slave device is located at a first distance with respect to the host controller, the first distance is less than the second distance, and the timer value is less than the second timer value.
In another example, a computer readable medium including instructions is to perform the method of any of the above examples.
In another example, a computer readable medium including data is to be used by at least one machine to fabricate at least one integrated circuit to perform the method of any one of the above examples.
In another example, an apparatus comprises means for performing the method of any one of the above examples.
In a further example, a system comprises: a first device coupled to a host controller via a bus, where the first device is a first distance from the host controller; a second device coupled to the host controller via the bus, where the second device is a second distance from the host controller, the second distance greater than the first distance; and the host controller having a read controller to read data communicated from the first device with a read clock, a timing of the read clock adjusted according to a timer value associated with the first device.
In an example, the host controller comprises a table and a link training controller, the link training controller to obtain the timer value from the table.
In an example, the link training controller is to send a first command to the first device to cause the first device to send an acknowledgement command to the host controller, and determine the timer value based on a time duration between a first time that the host controller is to send the first command and a second time that the host controller is to receive the acknowledgement command.
In an example, the system further comprises a timer to measure the time duration according to a second clock, the second clock having a higher clock rate than the read clock.
In yet another example, an apparatus comprises: means for accessing a storage means associated with a slave device having control of a bus to obtain a timer value associated with the slave device; means for adjusting a timing of a read clock based on the timer value; and means for reading data communicated from the slave device via the bus according to the adjusted timing of the read clock.
In an example, the apparatus further comprises means for dynamically determining the timer value for the slave device based on a training.
In an example, the apparatus further comprises means for sending a first command to the slave device to cause the slave device to send an acknowledgement command.
In an example, the apparatus further comprises means for determining the timer value based on a time duration between a first time at which the first command is sent and a second time at which the acknowledgement command is received.
In an example, the apparatus further comprises means for measuring the time duration according to a clock signal having a higher clock rate than the read clock.
Understand that various combinations of the above examples are possible.
Note that the terms “circuit” and “circuitry” are used interchangeably herein. As used herein, these terms and the term “logic” are used to refer to, alone or in any combination, analog circuitry, digital circuitry, hard wired circuitry, programmable circuitry, processor circuitry, microcontroller circuitry, hardware logic circuitry, state machine circuitry and/or any other type of physical hardware component. Embodiments may be used in many different types of systems. For example, in one embodiment a communication device can be arranged to perform the various methods and techniques described herein. Of course, the scope of the present invention is not limited to a communication device, and instead other embodiments can be directed to other types of apparatus for processing instructions, or one or more machine readable media including instructions that, in response to being executed on a computing device, cause the device to carry out one or more of the methods and techniques described herein.
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. Embodiments also may be implemented in data and may be stored on a non-transitory storage medium, which if used by at least one machine, causes the at least one machine to fabricate at least one integrated circuit to perform one or more operations. Still further embodiments may be implemented in a computer readable storage medium including information that, when manufactured into a SoC or other processor, is to configure the SoC or other processor to perform one or more operations. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.