The present disclosure relates to the operation of a modulator, demodulator, or modem to process a data stream using multiple processors with a distributed memory architecture, and more particularly to a dual, bi-directional interconnect ring bus.
Computing devices often contain modems to enable communications with other computing devices. Modems are typically configured to perform both transmitter and receiver operations and as such may be used in two-way communications devices, such as mobile phones. Because a modem may operate on data which occupies various spectra throughout its processing chain, it is common for a modem to be implemented as a chipset rather than on a single chip. For example, frequency translation and radio frequency/intermediate frequency (RF/IF) processing may be done on one chip (or die), which is then coupled to a second chip (or die) performing baseband functions, such as modulation/demodulation and encoding/decoding.
Baseband processing may be implemented in a variety of ways, such as through use of dedicated logic, processors, or combinations thereof. For example, it is not unusual for modern modems to contain up to 30 processors to implement baseband processing in a distributed memory architecture connected with a system bus. A system bus may comprise multiple busses, such as a control bus, address bus, and data bus, and may connect components in a variety of ways, such as ad hoc, with a cross-bar type bus, mesh, point-to-point protocol, or in a ring. Data congestion on a bus can vary depending on how components are connected on the bus, how data is routed, and the data arbitration scheme employed. For example, a centralized arbitration scheme operating on all components connected to a bus may induce undesirable latency that adds to data congestion on the bus. Furthermore, each processor may have fast access to its own local memory, but may also be required to access memory of another processor, which may exacerbate data congestion. In addition, some bus architectures do not scale well with the number of processors, and often need to be redesigned and laid out again to accommodate additional processors that may be added to a modem to support additional modem features.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter.
In some aspects, a device for processing signals comprises a plurality of nodes. Each node among the plurality of nodes has an address, and each of the addresses is different. The device also comprises a plurality of processors. Each of the plurality of processors is uniquely assigned to a node among the plurality of nodes. The device also comprises a dual interconnect bus that is comprised of a first ring bus and a second ring bus that connect the plurality of nodes in a ring. The first ring bus and the second ring bus may be configured for different data structures. The dual interconnect bus is configured to route data on at least one of the first ring bus or the second ring bus to at least one node among the plurality of nodes according to an address assigned to the at least one node. The data is processed by a processor among the plurality of processors assigned to the at least one node.
In other aspects, a method for routing data on a bus couples a plurality of nodes. Each node among the plurality of nodes is coupled to a first neighboring node in a first direction and a second neighboring node in a second direction to form a ring bus comprising at least two interconnect rings. A plurality of processors is assigned to the plurality of nodes. A first processor among the plurality of processors is configured to process a first data type, and a second processor among the plurality of processors is configured to process a second data type. Data on the ring bus is separated into the first data type and the second data type. At least part of the separated data of the first data type is routed on one interconnect ring to the first processor and at least part of the separated data of the second data type is routed on another interconnect ring to the second processor.
In yet other aspects, an apparatus for processing signals comprises a plurality of nodes. Each node among the plurality of nodes has an address where each of the addresses is different. The apparatus also comprises a plurality of processors and a ring bus having at least two interconnect rings. The apparatus also comprises means for, based on the addresses, assigning the plurality of processors to the plurality of nodes. A first processor among the plurality of processors is configured to process a first data structure, and a second processor among the plurality of processors is configured to process a second data structure. The apparatus also comprises means for, based on the addresses, coupling the plurality of nodes with the ring bus. Each node among the plurality of nodes is coupled to a first neighboring node in a first direction and a second neighboring node in a second direction. The apparatus also comprises means for, based on the first data structure and the second data structure, separating data on the ring. The apparatus also comprises means for, based on the separated data, routing at least part of the separated data on one interconnect ring to the first processor and at least another part of the separated data on another interconnect ring to the second processor.
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and does not purport to be limiting in any way. Other aspects, inventive features, and advantages of the devices and/or processes described herein, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth herein.
The detailed description references the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different instances in the description and the figures may indicate similar or identical items.
Modems often implement transmitters and receivers using processors and other signal-processing circuits. This disclosure describes a bi-directional, dual interconnect bus configured in a ring to route data to processors and signal-processing circuits implementing modem functions. Data may be separated according to data type and routed on a ring suitable to the data type. Arbitration may be done locally at the nodes that connect the processors and the other signal-processing circuits to the interconnect ring bus. As such, the dual interconnect ring bus is highly scalable: it is able to support any number of nodes and processors while retaining distributed memory support. The dual interconnect ring bus is also highly configurable, as it is able to support various layout configurations. It can also enable low congestion, because in the ring structure each node is connected only to its adjacent nodes, which reduces routing congestion. The dual interconnect ring bus may also shorten time-to-market, since various tiers of modems can be supported without redesign.
In the following discussion, an example modem, techniques that elements of the example modem may implement, and a system-on-chip on which elements of the example modem may be employed, are described. Consequently, performance of the example procedures is not limited to the example modem and the example modem is not limited to performance of the example procedures. Any reference made with respect to the example modem, or elements thereof, is by way of example only and is not intended to limit any of the aspects described herein.
In some embodiments, modem 100 performs frequency translation, encoding/decoding, and/or modulation/demodulation to process data sent over a communication link between a user device, such as a cell phone, and a cell tower/towers. Frequency translation, encoding/decoding, and/or modulation/demodulation may be in accordance with a signal protocol, such as a 3rd Generation Partnership Project (3GPP) protocol, a Long Term Evolution (LTE) protocol, and so forth. Modem 100 may be configurable to process signals in accordance with a first signal protocol when modem 100 is in a first configuration, and process signals in accordance with a second signal protocol when modem 100 is in a second configuration. For example, data processed by modem 100 can include signals comprising a first signal that complies with a first regulatory standard and a second signal that complies with a second regulatory standard, such as a first signal complying with a cellular phone standard in a first modem configuration, and a second signal complying with a Wi-Fi standard in a second modem configuration.
Analog RF circuitry 102 sends and receives data over a wireless communication link via one or more antennas, such as antenna 114. Antenna 114 may comprise a single antenna or a plurality of antennas. Alternately or additionally, analog RF circuitry 102 sends and receives data with baseband circuitry 104, such as over a data line. Among other things, analog RF circuitry 102 receives RF data, translates the data to baseband (or near baseband), such as part of a demodulation process, and forwards the baseband data to baseband circuitry 104. Analog RF circuitry 102 can also receive baseband data from baseband circuitry 104, translate the baseband data to RF, such as part of a modulation process, and transmit the modulated data via antenna 114. Frequency translation includes an up-conversion or down-conversion and may be done in a single conversion or in a plurality of conversion steps. For example, translation from an RF signal to a baseband signal may or may not include a translation to an intermediate frequency (IF). Analog RF circuitry 102 may also perform filtering, gain control, DC removal, and/or other compensations. Furthermore, it is to be understood that though modem 100 is illustrated in
Baseband circuitry 104 implements real-time baseband processing, such as transmitter functions and/or receiver functions, including mapping/de-mapping, cyclic prefix insertion/removal, encoding/decoding, inverse transforms/transforms, and the like. In some embodiments, baseband circuitry 104 includes dedicated hardware logic gates to perform various signal processing and/or real-time signal processing, and can be dynamically programmed using register settings. Baseband circuitry 104 may include a processor and be coupled to bus 106 as a way to communicate with and/or access host processor 108, baseband processors 110-1-110-N, and/or memories 112-1-112-N.
Host processor 108 provides command and control signals to various blocks contained within modem 100, such as analog RF circuitry 102, baseband circuitry 104, and/or baseband processors 110-1-110-N over bus 106. Host processor 108 can be any suitable type of processor, and have any suitable type of configuration. At times, host processor 108 includes a CODEC, video processor, media processor, address manager, and the like.
Baseband processors 110-1-110-N represent programmable processors configured to execute code to perform functions, such as frequency translation, data encoding, data decoding, data modulation, data demodulation, and so forth. Baseband processors 110-1-110-N can be any suitable type of processor, such as scalar processors, vector processors, or a combination thereof. Generally speaking, scalar processors utilize a low-bandwidth, narrow data-width bus for interrupts, data and message passing, while vector processors utilize a high-bandwidth, wide data-width bus to move large amounts of computation data. A processor may be configured to process both scalar data and vector data, or only scalar data or vector data. Furthermore, baseband processors 110-1-110-N are each coupled to, or have, respective memories 112-1-112-N. Thus, memories 112-1-112-N store code to be executed by respective baseband processors 110-1-110-N. Memories 112-1-112-N may comprise cache, flash, DRAM, SRAM, volatile and/or non-volatile memory, and/or any other type of suitable memory, such as computer-readable storage media (CRM) including any suitable type of data storage media, such as optical media (e.g., disc), magnetic media (e.g., disk or tape), and the like.
Blocks comprising modem 100, such as analog RF circuitry 102, baseband circuitry 104, baseband processors 110-1-110-N, memories 112-1-112-N, and host processor 108, may each be assigned an address so that they are identifiable on bus 106. Furthermore, a baseband processor may read/write from/to a memory coupled to the baseband processor, as well as a memory coupled to bus 106. For example, baseband processor 110-1 may read/write from/to any one of memories 112-1-112-N. Bus 106 may comprise multiple busses, such as a control bus, address bus, and data bus, and may connect components in a variety of ways, such as ad hoc, with a cross-bar type bus, mesh, point-to-point protocol, or in a ring. Data congestion on bus 106 can vary depending on how components are connected on the bus, how data is routed, and the data arbitration scheme employed.
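As a rough illustration of how unique addresses can make blocks identifiable on bus 106, the following Python sketch resolves an address to the block that owns it. The address ranges, the `ADDRESS_MAP` table, and the `decode` helper are hypothetical and chosen only for illustration; the disclosure does not specify an address map.

```python
from bisect import bisect_right

# Hypothetical, non-overlapping address ranges: (start, end_exclusive, block).
# The numeric ranges are assumptions; only the block names come from the text.
ADDRESS_MAP = [
    (0x000000, 0x010000, "host processor 108"),
    (0x010000, 0x020000, "baseband circuitry 104"),
    (0x020000, 0x030000, "memory 112-1"),
    (0x030000, 0x040000, "memory 112-2"),
]
_STARTS = [start for start, _, _ in ADDRESS_MAP]


def decode(address):
    """Return the name of the block owning `address`, or None if unmapped."""
    i = bisect_right(_STARTS, address) - 1  # last range starting at or below address
    if i < 0:
        return None
    start, end, name = ADDRESS_MAP[i]
    return name if start <= address < end else None
```

Under these assumed ranges, `decode(0x020004)` resolves to memory 112-1, while an address beyond the last range resolves to no block.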
Having described an example modem device in which various embodiments can be utilized, consider now a discussion of implementing a modem using a dual interconnect ring bus in accordance with one or more embodiments.
Bus 206 contains a plurality of interconnect ring busses connecting nodes 215-1-215-M in a ring to send and receive data transactions between different processors. Bus 206 may implement bus 106 in
Interconnect ring busses 206-1 and 206-2 may be configured to route different data types, data structures, data widths, data rates, data packet lengths, and/or data formats. For example, interconnect ring bus 206-1 may be configured to route data packetized in a first packet structure and communicated at a first data rate, and interconnect ring bus 206-2 may be configured to route data packetized in a second packet structure and communicated at a second data rate. In an embodiment, interconnect ring bus 206-1 is configured to route scalar data, and interconnect ring bus 206-2 is configured to route vector data. Additionally, interconnect ring busses 206-1 and 206-2 may support different data widths and different address widths, one to another. Alternatively, interconnect ring busses 206-1 and 206-2 may support a same data width and a same address width, one to another. In an embodiment, interconnect ring busses 206-1 and 206-2 each include a 24-bit address, 32-bit data bus. That is, each interconnect ring bus can transfer up to 32-bit data at up to a 24-bit address. In an implementation, interconnect ring busses 206-1 and 206-2 are configured to route a same data structure, including data width, data format, data packet structure, and/or data rate.
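To make the 24-bit address, 32-bit data embodiment concrete, the sketch below range-checks one transfer and packs it into a single integer. The two widths come from the text; the `pack` and `unpack` helpers and the layout (address in the high bits, data in the low bits) are assumptions made for illustration only.

```python
ADDR_BITS = 24  # address width stated for the embodiment
DATA_BITS = 32  # data width stated for the embodiment


def pack(address, data):
    """Pack one transfer into an integer (hypothetical layout: address above data)."""
    if not 0 <= address < (1 << ADDR_BITS):
        raise ValueError("address does not fit in 24 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data does not fit in 32 bits")
    return (address << DATA_BITS) | data


def unpack(word):
    """Recover (address, data) from a word produced by pack()."""
    return word >> DATA_BITS, word & ((1 << DATA_BITS) - 1)
```

A 24-bit address spans 2^24 = 16,777,216 distinct locations, so `pack` rejects any address at or above `1 << 24`.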
Data of different data types, data structures, data widths, data rates, data packet lengths, and/or data formats can be separated and placed onto different interconnect ring busses comprising bus 206 based on a determined data type, data structure, data width, data rate, data packet length, and/or data format. By separating data onto different interconnect ring busses, data congestion on bus 206 can be reduced.
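The separation step can be sketched as follows, assuming the embodiment in which interconnect ring bus 206-1 carries scalar data and interconnect ring bus 206-2 carries vector data. The `Transaction` record, the `RING_FOR_KIND` table, and the `separate` function are hypothetical names introduced for this example.

```python
from collections import namedtuple

# Hypothetical transaction record; the field names are assumptions.
Transaction = namedtuple("Transaction", ["dest_node", "kind", "payload"])

# Assumed mapping of data type to ring, following the scalar/vector embodiment.
RING_FOR_KIND = {"scalar": "206-1", "vector": "206-2"}


def separate(transactions):
    """Group transactions by the ring that carries their data type."""
    rings = {ring: [] for ring in RING_FOR_KIND.values()}
    for t in transactions:
        rings[RING_FOR_KIND[t.kind]].append(t)
    return rings
```

Because each ring then carries only one kind of traffic in this sketch, a short scalar message such as an interrupt does not queue behind a large vector transfer.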
Processors 210-1-210-M are connected to interconnect ring busses 206-1 and 206-2 through nodes 215-1-215-M, which are each assigned a unique ID. Each processor among processors 210-1-210-M is shown connected to interconnect ring busses 206-1 and 206-2 through a node among nodes 215-1-215-M, so that there is a one-to-one correspondence between processors and nodes. That is, each processor may be paired with a unique node. This implementation is illustrated in
Furthermore, each node among nodes 215-1-215-M is connected to two neighboring nodes using bus 206.
Multiple transactions may be in progress concurrently on interconnect ring busses 206-1 and 206-2. In some implementations, traffic is routed on interconnect ring busses 206-1 and/or 206-2 in the shortest direction. For example, if a first processor requests a transaction with a second processor, the direction (clockwise or counter-clockwise) is selected that requires the fewer node hops to traverse interconnect ring busses 206-1 and/or 206-2 from the node assigned to the first processor to the node assigned to the second processor. A node may be identified on interconnect ring busses 206-1 and/or 206-2 using the unique node ID assigned to that node.
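The shortest-direction selection can be illustrated with a minimal sketch. It assumes that nodes are indexed 0 to `num_nodes - 1`, that clockwise means increasing index, and that a tie is broken in favor of clockwise; none of these conventions is stated in the disclosure, and the function names are hypothetical.

```python
def hops(src, dst, num_nodes):
    """Return (clockwise_hops, counter_clockwise_hops) from src to dst."""
    cw = (dst - src) % num_nodes   # assumed: clockwise follows increasing index
    ccw = (src - dst) % num_nodes
    return cw, ccw


def choose_direction(src, dst, num_nodes):
    """Pick the direction with fewer node hops; a tie goes clockwise (assumption)."""
    cw, ccw = hops(src, dst, num_nodes)
    return ("clockwise", cw) if cw <= ccw else ("counter-clockwise", ccw)
```

For an eight-node ring, a transaction from node 0 to node 6 would take six hops clockwise but only two counter-clockwise, so the sketch selects the counter-clockwise direction.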
Each node among nodes 215-1-215-M, such as node 215-k in
Node 215-k also comprises bridges 307-1-307-2. Each bridge among bridges 307-1-307-2 is configured to transfer data on each I/O port among I/O ports 309-1-309-4 using connection mesh 310-1-310-2. For example, connection mesh 310-1-310-2 may include any suitable trace, wire bond, connection, and the like to enable data transfer between bridges 307-1-307-2 and I/O ports 309-1-309-4. Furthermore, bridges 307-1-307-2 enable data transfer to and from processor 210-k in environment 300. Here, “k” may be any integer, such as an integer between 1 and M. For example, processor 210-k may be any processor from among processors 210-1-210-M.
Processor 210-k comprises interconnect ports 312-1-312-2. Interconnect ports 312-1-312-2 enable data transfer to/from memories and/or processor assets that are mapped with unique address ranges and associated with processor 210-k. For example, interconnect ports 312-1-312-2 may enable data transfer to/from any memory from among memories 112-1-112-N in
Interconnect port 312-1 is coupled to bridge 307-1 through interfaces 316-1 and 316-2, and interconnect port 312-2 is coupled to bridge 307-2 through interfaces 314-1 and 314-2. Interfaces 314-1 and 316-1 may comprise “write” channels, and interfaces 314-2 and 316-2 may comprise “read” channels. Each of interfaces 314-1-314-2 and 316-1-316-2 may comprise a single channel or a plurality of channels. For example, interfaces 314-1 and 316-1 may include a single channel for address writing and data requests, or separate channels, one for address writing and one for data requests. In some implementations, at least one processor transfers data to/from a node over an interface that comprises a number of read channels and/or a number of write channels that is different from a number of read channels and/or number of write channels of an interface to transfer data between a node and a processor other than the at least one processor.
Additionally, in some implementations bridges 307-1-307-2 and/or interconnect ports 312-1-312-2 conform to a standard or an open standard bus interconnect protocol, such as AXI, PCI, or I2C. Furthermore, it is to be understood that though
In some implementations, modem 200 is implemented using a plurality of processors that supports a family of stock keeping units (SKUs) by including and/or activating a processor solely when the processor is to be used for one SKU among the family of SKUs. For example, a processor among the plurality of processors may be rendered inoperable so as to cause at least one feature of the device to be disabled. Furthermore, new SKUs may be added to a family of SKUs, as new processors may be added to modem 200 using a dual, bi-directional interconnect ring bus as described in
To illustrate these concepts, processor 210-3 and node 215-3 in
For example, node 215-3 may be a generic node added together with processor 210-3 to an existing chip supporting a particular SKU to create a new chip for another SKU. The addition of node 215-3 and processor 210-3 to create the new chip may not require reworking existing processors or nodes on the existing chip due to the dual-interconnect ring-bus architecture.
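The effect of adding a generic node can be pictured with a simple model in which a ring is an ordered list of node IDs. The sketch below is only an assumption-laden illustration: the `neighbors` and `add_node` helpers are hypothetical, and a physical design involves layout considerations this model ignores.

```python
def neighbors(ring, node):
    """Return the two neighbors of `node` in a ring given as an ordered list."""
    i = ring.index(node)
    return ring[i - 1], ring[(i + 1) % len(ring)]


def add_node(ring, new_node, after):
    """Return a new ring with `new_node` inserted directly after node `after`."""
    i = ring.index(after)
    return ring[: i + 1] + [new_node] + ring[i + 1:]
```

In this model, inserting node 215-3 between nodes 215-2 and 215-4 changes one neighbor link on each of those two nodes and leaves every other node's links as they were.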
Having described processors and interconnections of the processors comprising a modem device in which various embodiments can be utilized, consider now a discussion of supplying system clocks to the processors connected with a dual interconnect ring bus in accordance with one or more embodiments. The dual interconnect ring bus allows for relaxed clock tree requirements.
Environment 400 may implement modem 200 with an unbalanced clock tree by relaxing a requirement that clock distribution circuits across all processors or clusters of processors be balanced. In
At 505, a plurality of nodes are coupled. For example, the plurality of nodes may be nodes 215-1-215-M in
At 510, a plurality of processors are assigned to the plurality of nodes. For example, the plurality of processors may be processors 210-1-210-M in
At 515, data on the ring bus is separated into the first data type and the second data type. The first and second data types may comprise a data structure, data width, data rate, data packet length, and/or data format. A data structure, data width, data rate, data packet length, and/or data format comprising the first data type may be different from a data structure, data width, data rate, data packet length, and/or data format comprising the second data type, respectively, so that the first data type is different from the second data type. Alternatively, the first data type and the second data type may be comprised of a same data structure, data width, data rate, data packet length, and/or data format, so that the first data type is the same as the second data type.
At 520, at least part of the separated first data type is routed on one interconnect ring to the first processor and at least part of the separated second data type is routed on another interconnect ring to the second processor. Data may be routed in a direction determined at least in part from a minimum distance and/or minimum number of node hops around the ring bus. A direction may be determined from among a clockwise direction and a counter-clockwise direction. The one interconnect ring and/or the another interconnect ring are selectable to route data based at least in part on a minimum distance calculation and/or a determination of a minimum number of node hops around the ring bus. The separated first data type may include scalar data, and the separated second data type may include vector data. Arbitration of routed data may be done at the plurality of nodes. Routed data may be routed to a destination, such as a processor, using at least in part a unique ID assigned to a node.
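Arbitration performed locally at a node might look like the following sketch. The disclosure does not name an arbitration policy; round-robin is used here purely as an assumption, and the `NodeArbiter` class and its source names are hypothetical.

```python
from collections import deque


class NodeArbiter:
    """Round-robin arbiter for one node (the policy is an assumption)."""

    def __init__(self, sources):
        self.queues = {s: deque() for s in sources}  # pending requests per source
        self.order = deque(sources)                  # current service order

    def request(self, source, item):
        """Queue `item` from `source` for transfer through this node."""
        self.queues[source].append(item)

    def grant(self):
        """Grant the next pending request in turn, or return None if idle."""
        for _ in range(len(self.order)):
            source = self.order[0]
            self.order.rotate(-1)  # the source just examined moves to the back
            if self.queues[source]:
                return source, self.queues[source].popleft()
        return None
```

Because each node decides only among its own inputs in this sketch, no central arbiter has to observe every component on the ring.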
At 605, addresses are assigned to a plurality of nodes. For example, the plurality of nodes may be nodes 215-1-215-M in
At 610, a plurality of processors are assigned to the plurality of nodes. For example, the plurality of processors may be processors 210-1-210-M in
At 615, the plurality of nodes are connected in a ring using a dual interconnect bus comprising a first ring bus and a second ring bus. For example, the first and second ring busses may include interconnect ring busses 206-1 and 206-2 in
System-on-chip 700 may be integrated with a microprocessor, storage media, I/O logic, data interfaces, logic gates, a transmitter, a receiver, circuitry, firmware, software, and/or combinations thereof to provide communicative or processing functionalities. System-on-chip 700 may include a data bus (e.g., cross bar or interconnect fabric) enabling communication between the various components of the system-on-chip. In some aspects, components of system-on-chip 700 may interact via the data bus to implement aspects of data routing on a dual interconnect ring bus.
In this particular example, system-on-chip 700 includes processor cores 702 and memory 704. Memory 704 may include any suitable type of memory, such as volatile memory (e.g., DRAM), non-volatile memory (e.g., flash), cache, and the like. For example, memory 704 may comprise memories 112-1-112-N in
System-on-chip 700 also includes interconnect ring busses 206-1-206-2 and interconnect nodes 215-1-215-M, which may be configured as a dual interconnect ring bus as illustrated in
System-on-chip 700 also includes analog RF circuitry 102 and baseband circuitry 104, which may be embodied separately or combined with other components described herein. For example, baseband circuitry 104 may be connected to interconnect ring busses 206-1-206-2 via a node, such as nodes 215-1-215-M, to implement functions of a modem concurrently or in combination with processors comprising processor cores 702. Alternately or additionally, baseband circuitry 104 and the other components can be implemented as hardware, firmware, fixed logic circuitry, or any combination thereof that is implemented in connection with interconnect ring busses 206-1-206-2 and/or other signal processing and control circuits of system-on-chip 700.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, functions may be stored on a computer-readable storage medium (CRM). In the context of this disclosure, a computer-readable storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer that does not include transitory propagating signals or carrier waves. By way of example, and not limitation, such media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, or any other non-transitory medium that can be used to carry or store information that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. The information can include any suitable type of data, such as computer readable instructions, sampled signal values, data structures, program components, or other data. These examples, and any combination of storage media and/or memory devices, are intended to fit within the scope of non-transitory computer-readable media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with a laser. Combinations of the above should also be included within the scope of computer-readable media.
Firmware components include electronic components with programmable memory configured to store executable instructions that direct the electronic component how to operate. In some cases, the executable instructions stored on the electronic component are permanent, while in other cases, the executable instructions can be updated and/or altered. At times, firmware components can be used in combination with hardware components and/or software components.
The terms “component”, “module”, and “system” are intended to refer to one or more computer-related entities, such as hardware, firmware, software, or any combination thereof, as further described above. At times, a component may refer to a process and/or thread of execution that is defined by processor-executable instructions. Alternately or additionally, a component may refer to various electronic and/or hardware entities.
Certain specific embodiments are described above for instructional purposes; however, the teachings of this disclosure have general applicability and are not limited to the specific embodiments described above. The bi-directional, dual interconnect ring bus is not limited to use in realizing modems that communicate in accordance with any particular interface standard such as LTE, UMB, or WiMAX, but rather the bi-directional, dual interconnect ring bus has general applicability to other interface standards.
This application claims priority to U.S. Provisional Patent Application No. 62/222,725, filed Sep. 23, 2015, the disclosure of which is incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
62222725 | Sep 2015 | US