This application is a translation of and claims the priority benefit of Italian Patent Application No. 102023000002871, filed on Feb. 20, 2023, entitled “Processing system, related integrated circuit, device and method,” which is hereby incorporated herein by reference to the maximum extent allowable by law.
Embodiments of the present disclosure relate to processing systems, such as multi-core microcontrollers, comprising one or more network ports, and associated integrated circuits, devices and methods.
For example, in
In this respect, future generations of such processing systems 10, e.g., microcontrollers adapted to be used in automotive applications, are expected to exhibit an increase in complexity, mainly due to the increasing number of requested functionalities (new protocols, new features, etc.) and to the tight constraints on execution conditions (e.g., lower power consumption, increased calculation power and speed, etc.). For example, recently more complex multi-core processing systems 10 have been proposed. For example, such multi-core processing systems may be used to execute (in parallel) several of the processing systems 10 shown in
For example, as shown for the example of the processing core 102_1, each processing core 102 may comprise a microprocessor 1020 and a communication interface 1022 configured to manage the communication between the microprocessor 1020 and the communication system 114. Typically, the interface 1022 is a master interface configured to forward a given (read or write) request from the microprocessor 1020 to the communication system 114, and to forward an optional response from the communication system 114 to the microprocessor 1020. However, the communication interface 1022 may also comprise a slave interface. For example, in this way, a first microprocessor 1020 may send a request to a second microprocessor 1020 (via the communication interface 1022 of the first microprocessor, the communication system 114 and the communication interface 1022 of the second microprocessor). Generally, each processing core 102_1 . . . 102_n may also comprise further local resources, such as one or more local memories 1026, usually identified as Tightly Coupled Memory (TCM).
Typically, the processing cores 102 are arranged to exchange data with one or more non-volatile memories 104 and/or one or more volatile memories 104b. Generally, the memories 104 and/or 104b may be integrated with the processing cores 102 in a single integrated circuit, or the memories 104 and/or 104b may be in the form of a separate integrated circuit and connected to the processing cores 102, e.g., via the traces of a printed circuit board.
Specifically, in a multi-core processing system 10 these memories are often system memories, i.e., shared among the processing cores 102_1 . . . 102_n. For example, for this purpose, the communication with the memories 104 and/or 104b may be performed via one or more memory controllers 100 connected to the communication system 114. As mentioned before, each processing core 102 may, however, comprise one or more additional local memories 1026.
For example, the software executed by the microprocessor(s) 1020 is usually stored in a non-volatile (program) memory 104, such as a Flash memory or EEPROM, i.e., the memory 104 is configured to store the firmware of the processing core 102, wherein the firmware includes the software instructions to be executed by the microprocessor 1020. Generally, the non-volatile memory 104 may also be used to store other data, such as configuration data, e.g., calibration data. Conversely, a volatile memory 104b, such as a Random Access Memory (RAM), may be used to store temporary data.
Often, the processing system 10 also comprises one or more (hardware) resources/peripherals 106, e.g., selected from the group of:
The resources 106 are usually connected to the communication system 114 via a respective communication interface 1062, such as a peripheral bridge. For example, for this purpose, the communication system 114 may indeed comprise an Advanced Microcontroller Bus Architecture (AMBA) High-performance Bus (AHB), and an Advanced Peripheral Bus (APB) used to connect the resources/peripherals 106 to the AMBA AHB bus. In general, the communication interface 1062 comprises at least a slave interface. For example, in this way, a processing core 102 may send a request to a resource 106 and the resource returns given data. Generally, one or more of the communication interfaces 1062 may also comprise a respective master interface. For example, such a master interface, often identified as an integrated Direct Memory Access (DMA) controller, may be useful in case the resource has to start a communication in order to exchange data via (read and/or write) requests with another circuit connected to the communication system 114, such as a resource 106 or a processing core 102.
Often such processing systems 10 also comprise one or more general-purpose DMA controllers 110. For example, as shown in
As mentioned before, automotive technology is rapidly developing and new complex automotive applications often have significant bandwidth requirements for in-vehicle connections. For example, usually the communications between the ECUs of a vehicle use one or more Controller Area Network (CAN) buses. Recently, it has been proposed to use (at least in part) Ethernet communication channels for these communications, e.g., by using the Internet Protocol (IP), e.g., the Transmission Control Protocol (TCP), also identified as TCP/IP.
For example, as shown in
In the example considered, one or more of these processing systems, such as the processing systems 10_2 and 10_3, may be connected to further processing systems, such as the processing systems 10_4, 10_5, 10_6 and 10_7, via one or more further communication systems 20_2 and 20_3. In general, these further communication systems may be based on Ethernet and/or other communication systems. For example, the communication system 20_2 may be based on Ethernet, while the communication system 20_3 may be a CAN bus. Accordingly, the processing system 10_2 may be configured as a router configured to forward data between the Ethernet networks 20_1 and 20_2. Conversely, the processing system 10_3 may be configured as a gateway configured to exchange data between the Ethernet network 20_1 and the CAN bus 20_3, also implementing a protocol conversion. For example, the communication channel 20_3 may be a low-speed communication channel used for a front, rear and/or interior lighting system. Accordingly, the various processing systems 10 of the vehicle shown in
As shown in
Accordingly, in the scenario shown in
Typically, the further communication interface of the processing system 10_8 is a mobile communication interface, such as a 2G, 3G, 4G or 5G communication interface, such as an LTE (Long-Term Evolution) transceiver, and/or a wireless communication interface according to one of the versions of the IEEE 802.11 standard.
In this respect, as mentioned before, a plurality of the ECUs may also be implemented with the same multi-core processing system 10. For example, the ECUs 10_4 and 10_5 may indeed be implemented with the same physical multi-core processing system, e.g., by using one or more dedicated processing cores 102 for each ECU or by using a virtualization with a hypervisor. Accordingly, in this case, the processing system may comprise for each ECU (10_4 and 10_5) a respective dedicated physical Ethernet communication interface, or the processing system may comprise a physical Ethernet communication interface connected to the Ethernet network 20_2 and emulate for each ECU a respective virtual communication interface. Generally, a virtual communication interface may be emulated at the Ethernet frame level, in particular by emulating a plurality of Media Access Control (MAC) addresses with the same physical Ethernet communication interface, or at the higher protocol layers, e.g., by assigning a plurality of IP addresses to the same Ethernet communication interface.
In general, as represented in
In the context of an Ethernet communication, the IP packet is then included in an Ethernet frame comprising an Ethernet header E_H and the IP packet as payload E_D. For example, the Ethernet header E_H comprises a destination MAC address and a source MAC address. Typically, the Ethernet frame also comprises additional Cyclic Redundancy Check (CRC) data.
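As an illustration of this encapsulation (not part of the described processing system), a minimal Ethernet II frame can be assembled in a few lines of Python; the 14-byte header layout and the 0x0800 EtherType for IPv4 payloads are standard Ethernet details, while the addresses and payload below are made up for the example:

```python
import struct

def build_ethernet_frame(dst_mac: bytes, src_mac: bytes, ip_packet: bytes) -> bytes:
    """Assemble a minimal Ethernet frame: header E_H followed by the IP packet
    as payload E_D. The trailing 4-byte CRC (FCS) is omitted here, since it is
    typically computed and appended by the MAC hardware."""
    ETHERTYPE_IPV4 = 0x0800  # standard EtherType value for IPv4 payloads
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_IPV4)
    return header + ip_packet

# Illustrative (locally administered) MAC addresses and a placeholder payload.
frame = build_ethernet_frame(b"\x02\x00\x00\x00\x00\x02",  # destination MAC
                             b"\x02\x00\x00\x00\x00\x01",  # source MAC
                             b"ip-packet-bytes")
```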
Specifically, in the example considered, the hosts H1 and H2 are connected to a first Ethernet network 20_1 to which a first IP address range, e.g., 192.168.0.0/24, is assigned. For this purpose, the host H1 comprises an Ethernet communication interface IF1H1 having a MAC address MAC1, wherein a first IP address IP1 in the first IP address range, such as 192.168.0.2, is associated with the communication interface IF1H1, and the host H2 comprises an Ethernet communication interface IF1H2 having a MAC address MAC2, wherein a second IP address IP2 in the first IP address range, such as 192.168.0.1, is associated with the communication interface IF1H2.
Similarly, the hosts H2 and H3 are connected to a second Ethernet network 20_2 to which a second IP address range, e.g., 192.168.1.0/24, is assigned. For this purpose, the host H2 comprises a further Ethernet communication interface IF2H2 having a MAC address MAC3, wherein a third IP address IP3 in the second IP address range, such as 192.168.1.1, is associated with the communication interface IF2H2, and the host H3 comprises an Ethernet communication interface IF1H3 having a MAC address MAC4, wherein a fourth IP address IP4 in the second IP address range, such as 192.168.1.2, is associated with the communication interface IF1H3.
Accordingly, in order to send a data packet from the host H1 to the host H2, wherein the IP packet header IP_H comprises as source address the IP address IP1 (e.g., 192.168.0.2) and as destination address the IP address IP2 (e.g., 192.168.0.1), the host H1 has to generate an Ethernet frame comprising (in addition to the IP packet) as source MAC address the address MAC1 and as destination MAC address the address MAC2, wherein the host H1 transmits this Ethernet frame via the communication interface IF1H1 to the network 20_1, which in turn forwards the Ethernet frame to the interface IF1H2 of the host H2.
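The subnet reasoning behind this direct delivery can be reproduced with Python's standard ipaddress module; this is purely an illustration of the address arithmetic, using the example addresses above:

```python
import ipaddress

# /24 ranges assigned to the two example networks
net1 = ipaddress.ip_network("192.168.0.0/24")
net2 = ipaddress.ip_network("192.168.1.0/24")

ip1 = ipaddress.ip_address("192.168.0.2")  # host H1, interface IF1H1
ip2 = ipaddress.ip_address("192.168.0.1")  # host H2, interface IF1H2
ip4 = ipaddress.ip_address("192.168.1.2")  # host H3, interface IF1H3

# H1 and H2 share the first subnet, so a single Ethernet frame suffices
# (destination MAC address = MAC2).
direct = ip1 in net1 and ip2 in net1

# IP4 lies outside the first subnet, so H1 must go through a gateway (H2).
needs_gateway = ip4 not in net1 and ip4 in net2
```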
Conversely, in order to send a data packet from the host H1 to the host H3, wherein the IP packet header IP_H comprises as source address the IP address IP1 (e.g., 192.168.0.2) and as destination address the IP address IP4 (e.g., 192.168.1.2), indeed two communications are required.
Specifically, first the host H1 has to determine that the host H3 may be reached via the host H2. Then the host H1 has to generate an Ethernet frame comprising (in addition to the IP packet) as source MAC address the address MAC1 and as destination MAC address the address MAC2, wherein the host H1 transmits this Ethernet frame via the communication interface IF1H1 to the network 20_1, which in turn forwards the Ethernet frame to the interface IF1H2 of the host H2.
The host H2 then processes the received IP packet included in the Ethernet frame in order to determine that the included IP packet should be transmitted to an IP address not managed by the host H2, wherein the respective host is reachable via the interface IF2H2. Accordingly, at this point, the host H2 has to generate an Ethernet frame comprising (in addition to the IP packet) as source MAC address the address MAC3 and as destination MAC address the address MAC4, wherein the host H2 transmits this Ethernet frame via the communication interface IF2H2 to the network 20_2, which in turn forwards the Ethernet frame to the interface IF1H3 of the host H3.
Accordingly, in order to correctly forward IP packets, each host H1, H2 and H3 has to manage a routing table, which often comprises the following data for each route:
- a destination IP address range, i.e., the subnetwork to be reached;
- optionally a gateway/next-hop IP address to be used to reach the subnetwork; and
- the communication interface to be used to transmit the respective Ethernet frames.
For example, the host H1 may have an associated routing table RT1 indicating that the subnetwork assigned to the network 20_1 may be reached directly via the interface IF1H1, while the subnetwork assigned to the network 20_2 may be reached by using as gateway/next-hop the IP address IP2. The Ethernet communication interface IF1H1 may then determine the MAC address associated with the next-hop, i.e., the MAC address MAC2, and generate the respective Ethernet frame.
Conversely, the host H2 may have an associated routing table RT2 indicating that the subnetwork assigned to the network 20_1 may be reached directly via the interface IF1H2, and the subnetwork assigned to the network 20_2 may be reached directly via the interface IF2H2.
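The behavior of such routing tables can be sketched in software as follows; the dictionary layout and the first-match policy are illustrative assumptions, using the routes of the table RT1 described above:

```python
import ipaddress

# Illustrative routing table for host H1 (RT1): each route maps a destination
# subnetwork to an outgoing interface and an optional gateway/next-hop address.
RT1 = [
    {"subnet": ipaddress.ip_network("192.168.0.0/24"), "iface": "IF1H1",
     "gateway": None},                                  # directly reachable
    {"subnet": ipaddress.ip_network("192.168.1.0/24"), "iface": "IF1H1",
     "gateway": ipaddress.ip_address("192.168.0.1")},   # next-hop = IP2 (host H2)
]

def route(table, dst_ip):
    """Return (interface, target IP) for a destination address: the target is
    the destination itself for a direct route, or the gateway otherwise."""
    for entry in table:
        if dst_ip in entry["subnet"]:
            target = entry["gateway"] if entry["gateway"] is not None else dst_ip
            return entry["iface"], target
    return None  # no matching route

# Reaching H3 (192.168.1.2) goes via the gateway IP2 on interface IF1H1.
iface, target = route(RT1, ipaddress.ip_address("192.168.1.2"))
```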
Accordingly, in a scenario as shown in
In view of the above, various embodiments of the present disclosure provide solutions for integrated network accelerators, in particular in the context of the organization and management of IP routing tables.
According to one or more embodiments, the above objective is achieved by means of a processing system having the features specifically set forth in the claims that follow. Embodiments moreover concern a related integrated circuit, device and method.
The claims are an integral part of the technical teaching of the disclosure provided herein.
As mentioned before, various embodiments of the present disclosure relate to a processing system comprising a hardware network accelerator. In various embodiments, the hardware network accelerator comprises a plurality of Ethernet communication interfaces.
In various embodiments, the hardware network accelerator moreover comprises a plurality of memories and a further memory. Specifically, each of the plurality of memories is configured to store a plurality of records, wherein each record comprises enable data indicating whether the respective record contains valid data and destination IP data identifying a destination IP address range. Conversely, the further memory is configured to store a plurality of further records, wherein each record is uniquely associated with a respective further record, and wherein each further record comprises next-hop data indicating a next-hop IP address, next-hop enable data indicating whether a respective destination IP address range may be reached directly or via the respective next-hop IP address, and network port data indicating one of the Ethernet communication interfaces.
For example, in various embodiments, each record comprises a first data section comprising a destination IP address included in the respective destination IP address range, and a first control data section comprising the enable data and a field for storing a subnet or network mask for the respective destination IP address range. Conversely, each further record may comprise a second data section comprising the next-hop IP address, and a second control data section comprising the next-hop enable data and the network port data.
In various embodiments, each Ethernet communication interface is configured to obtain an IP packet comprising a destination IP address. Next, the Ethernet communication interface accesses in parallel the plurality of memories in order to sequentially read at least in part the records stored to each of the plurality of memories, and compares the destination IP address with the destination IP data of the read records containing valid data in order to select a record having a destination IP address range containing the destination IP address.
For example, in order to access in parallel the plurality of memories, each Ethernet communication interface may comprise a search engine comprising a plurality of search circuits, wherein each search circuit is configured to access a respective memory of the plurality of memories in order to sequentially read at least in part the records stored to the respective memory of the plurality of memories, and compare the destination IP address with the destination IP data of the respective read records containing valid data in order to determine whether the record has a destination IP address range containing the destination IP address. For example, each search circuit may be configured to sequentially read the records stored to the respective memory of the plurality of memories until the enable data indicate that the respective record does not contain valid data or at least one of the search circuits determines that a record has a destination IP address range containing the destination IP address.
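As a rough software model of this parallel search (not the claimed hardware), the lock-step scan over the plurality of memories, with its two stop conditions, might be sketched as follows; the record layout is an illustrative assumption:

```python
import ipaddress

def search_banks(banks, dst_ip):
    """Software model of parallel search circuits scanning their banks in lock-step.

    Each record is a tuple (valid, net, index): the enable data, the destination
    IP address range, and the position of the associated further record. A bank
    stops scanning when it reads an invalid record; the whole search stops as
    soon as any bank finds a range containing dst_ip.
    """
    active = [True] * len(banks)
    depth = max((len(b) for b in banks), default=0)
    for step in range(depth):                      # one read cycle per step...
        for i, bank in enumerate(banks):           # ...issued to all banks "in parallel"
            if not active[i] or step >= len(bank):
                continue
            valid, net, index = bank[step]
            if not valid:                          # enable data: this bank is done
                active[i] = False
            elif dst_ip in net:                    # match: select this record
                return index
    return None                                    # no matching record

banks = [
    [(True, ipaddress.ip_network("10.0.0.0/24"), 0), (False, None, None)],
    [(True, ipaddress.ip_network("10.0.1.0/24"), 1), (False, None, None)],
]
hit = search_banks(banks, ipaddress.ip_address("10.0.1.7"))
```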
Once a record is selected, the Ethernet communication interface reads the further record associated with the selected record from the further memory, and selects one of the Ethernet communication interfaces based on the network port data of the read further record.
In various embodiments, in order to manage plural communications, the hardware network accelerator may comprise a memory controller configured to decide which of the Ethernet communication interfaces may access the plurality of memories and the further memory.
Accordingly, in various embodiments, the selected Ethernet communication interface may determine a target IP address by selecting the destination IP address when the next-hop enable data of the read further record indicate that the respective destination IP address range may be reached directly, or the next-hop IP address of the read further record when the next-hop enable data of the read further record indicate that the respective destination IP address range may be reached via the respective next-hop IP address. Next, the selected Ethernet communication interface may determine a target Media Access Control (MAC) address for the target IP address, generate an Ethernet frame comprising an Ethernet header and as payload the IP packet, wherein the Ethernet header comprises as destination MAC address the target MAC address and a source MAC address configured for the respective Ethernet communication interface. Finally, the selected Ethernet communication interface may transmit the Ethernet frame to an Ethernet network connected to the selected Ethernet communication interface.
In various embodiments, the processing system also comprises a microprocessor and a communication system connecting the microprocessor to the hardware network accelerator, wherein the microprocessor is adapted to program the records in the plurality of memories and the further records in the further memory.
For this purpose, the hardware network accelerator may comprise an address translation and dispatcher circuit configured to manage an address map, wherein each memory location in the plurality of memories and each memory location in the further memory is associated with a respective memory address in the address map. Accordingly, the address translation and dispatcher circuit may receive a memory write request comprising a memory address in the address map and respective data to be stored, select one of the plurality of memories and the further memory based on the received memory address, select a memory location of the selected memory based on the received memory address, and store the data to be stored to the selected memory location of the selected memory.
For example, in various embodiments, the memory address comprises a first field, a second field and a third field, wherein the first field indicates whether the received data should be stored to the plurality of memories or the further memory. When the first field indicates that the received data should be stored to the plurality of memories, the second field indicates a record in the plurality of memories, and the third field indicates whether the received data should be stored to the first data section or the first control data section of the record indicated by the second field. Conversely, when the first field indicates that the received data should be stored to the further memory, the second field indicates a further record in the further memory, and the third field indicates whether the received data should be stored to the second data section or the second control data section of the further record indicated by the second field.
In various embodiments, at least one of the first data section, the first control data section, the second data section and the second control data section may be stored to a plurality of memory slots in the respective memory. In this case, the respective memory address may comprise a fourth field indicating a respective memory slot of the respective plurality of memory slots.
In various embodiments, each memory location of the plurality of memories and the further memory may comprise a plurality of bytes. In this case, the memory address may comprise a fifth field indicating a sub-set of the bytes of the memory slot indicated by the first field, the second field, the third field, and optionally the fourth field.
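As a sketch of such an address map, assuming purely illustrative field widths (the description above does not fix them), the first to fifth fields can be unpacked from a memory address with shifts and masks:

```python
def decode_address(addr: int):
    """Decode an illustrative address-map layout for the translation circuit.

    Assumed layout (LSB first): 2 byte-select bits (fifth field), 2 slot bits
    (fourth field), 1 section bit (third field), 8 record-index bits (second
    field), 1 memory-select bit (first field). These widths are examples only.
    """
    byte_sel = addr & 0x3          # fifth field: byte sub-set within the slot
    slot     = (addr >> 2) & 0x3   # fourth field: memory slot within the section
    section  = (addr >> 4) & 0x1   # third field: data vs. control data section
    record   = (addr >> 5) & 0xFF  # second field: record / further-record index
    target   = (addr >> 13) & 0x1  # first field: plurality of memories (0) or further memory (1)
    return target, record, section, slot, byte_sel

# Further memory (1), further record 3, control data section (1), slot 1, bytes 2.
fields = decode_address(0b1_00000011_1_01_10)
```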
In various embodiments, the hardware network accelerator may communicate with the microprocessor via a slave interface and/or (indirectly) via a DMA transfer. For example, when the communication system comprises a slave communication interface associated with the hardware network accelerator, the slave communication interface may be configured to receive a write request from the microprocessor, wherein the write request comprises an address in a physical address sub-range associated with the slave communication interface and respective transmitted data. Next, the slave interface may extract a memory address in the address map of the address translation and dispatcher circuit from the address in the physical address sub-range, and provide a memory write request to the address translation and dispatcher circuit, wherein the memory write request comprises the extracted memory address and the transmitted data.
Accordingly, in various embodiments, the microprocessor may program a plurality of records in the plurality of memories and respective further records in the further memory. Next, an IP packet may be provided to one of the Ethernet communication interfaces, whereby the Ethernet communication interface uses the plurality of records in the plurality of memories and the respective further records in the further memory in order to select one of the Ethernet communication interfaces, and the selected Ethernet communication interface generates an Ethernet frame and transmits the Ethernet frame to an Ethernet network connected to the selected Ethernet communication interface.
Embodiments of the present disclosure will now be described with reference to the annexed drawings, which are provided purely by way of non-limiting example and in which:
In the following description, numerous specific details are given to provide a thorough understanding of embodiments. The embodiments can be practiced without one or several specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the embodiments.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The headings provided herein are for convenience only and do not interpret the scope or meaning of the embodiments.
In the following
As mentioned before, various embodiments of the present disclosure provide solutions for integrated network accelerators, in particular in the context of the organization and management of IP routing tables.
Specifically, such a network accelerator 4 may be used as a resource 106 in the processing system 10 shown in
Specifically, in the embodiment considered, the network accelerator 4 comprises at least one communication interface 40, often also indicated as a network port. For example, in
In various embodiments, the communication interface 40 obtains an IP packet comprising an IP header IP_H, which in turn includes a destination IP address (see
For example, in various embodiments, the communication interface 40 is configured to receive data from the communication system 114 (not shown in
Conversely, in order to receive the IP packet from the network 20 to which the communication interface 40 is connected, the communication interface 40 may be configured to receive an Ethernet frame and extract the IP packet from the payload E_D of the Ethernet frame.
In various embodiments, an IP packet obtained by a communication interface 40 may be destined to (and should thus be routed to):
- a circuit within the processing system 10, such as a processing core 102 or a resource 106; or
- another network 20, i.e., the IP packet should be forwarded via another communication interface 40 of the network accelerator 4.
Accordingly, in various embodiments, the network accelerator 4 may also be configured to implement an IP router. In this case, the generation of the IP packet within the processing system 10 and the forwarding of IP packets to the communication system 114 may be purely optional.
In various embodiments, the network accelerator 4, e.g., each communication interface, may also implement firewall functions, such as a destination IP address and/or (e.g., TCP and/or UDP) destination port filtering, and/or a source IP address and/or (e.g., TCP and/or UDP) source port filtering.
As described in the foregoing, in order to forward IP packets, the processing system 10 uses an IP routing table. For example, the IP routing table could be managed in software, wherein an IP packet received by a communication interface 40 from a respective network is provided to a processing core 102, which then determines via software instructions whether the IP packet should be processed internally or forwarded to another communication interface 40.
Conversely, in various embodiments, the network accelerator 4 is configured to manage the routing of the IP packet directly in hardware. Specifically, in various embodiments, the network accelerator 4 comprises a routing table search engine 400 configured to analyze a routing table stored to a memory 44. In general, the search engine 400 may be common for the complete network accelerator 4 or (as shown in
Accordingly, a processing core 102 may configure the IP routing table by writing the content of the memory 44. For example, for this purpose, the memory controller 42 may have associated a communication interface 46 connected to the communication system 114 of the processing system 10. For example, the communication interface 46 may be a slave interface of the communication system 114, such as a peripheral bridge. In general, the communication interface 46 may also be included in the memory controller 42.
Accordingly, once having obtained (i.e., received or generated) an IP packet, a communication interface 40 may access the memory 44 and read the IP routing table in order to decide how to forward the IP packet (to a network 20, another communication interface 40 or to a further circuit within the processing system 10, such as a processing core 102).
In various embodiments, in order to improve the speed of the read operations of the IP routing table, the memory 44 may indeed be implemented with a plurality of NB memories 44_1 to 44_NB, such as a plurality of NB RAM banks. Accordingly, in this case, each search engine 400 may indeed comprise NB search circuits 400_1 to 400_NB, wherein each search circuit is configured to access a respective memory 44_1 to 44_NB.
Accordingly, once having processed the routing table, the search engine 400 may decide whether the IP packet should be forwarded to the respective network 20, another communication interface 40 or to a circuit 102/106 within the processing system 10. For example, in order to provide the data to a processing core 102, a resource 106 or another communication interface 40, the communication interface 40 may comprise an integrated DMA interface or may use a general-purpose DMA controller 110 configured to store the IP packet or the respective payload IP_D via a DMA transfer to the memory 104b or one or more dedicated RAM memories within the network accelerator 4. Conversely, in order to forward the IP packet to the network 20, the communication interface 40 may generate an Ethernet frame comprising the IP packet as payload E_D and an Ethernet header E_H comprising its own MAC address as source address and the MAC address of the next-hop as target address. In various embodiments, the communication interface 40 may also be configured to manage a plurality of virtual Ethernet interfaces, e.g., by configuring a plurality of source MAC addresses.
Specifically, in the embodiment considered, the memory 44 comprises a given number N of routing table entry slots RTE_1 to RTE_N, wherein each slot RTE may store a respective routing table entry. For example, in case the memory 44 comprises NB memory banks 44_1 to 44_NB, each memory bank may store N/NB slots. For example, the first memory bank 44_1 may be arranged to store the slots RTE_1 to RTE_(N/NB).
Specifically, in the embodiment considered, each routing table entry RTE comprises the following fields:
- a destination IP address field IP_DA;
- a next-hop IP address field IP_NH; and
- control data CNTRL.
For example, in various embodiments, the control data CNTRL comprise an enable field EN indicating that the respective routing table entry RTE contains valid data. Moreover, in various embodiments, the control data CNTRL comprise a field SUBNET for indicating a subnet or netmask for the destination IP address, which thus permits specifying (together with the destination IP address field IP_DA) a destination IP address range. In various embodiments, the control data CNTRL also comprise a field NH_EN, such as a next-hop address enable bit, indicating whether:
- the respective destination IP address range may be reached directly; or
- the respective destination IP address range may be reached via the next-hop IP address indicated in the field IP_NH.
Accordingly, the control data CNTRL may indicate whether to use the IP address of the IP packet or the next-hop IP address field IP_NH as target for the Ethernet communication, which is then mapped by the communication interface 40 to a respective MAC address to be inserted in the Ethernet frame. Moreover, in case of a plurality of (physical or virtual) communication interfaces, a field DST_PORT of the control data CNTRL may specify a (physical or virtual) communication interface 40 to be used to transmit the IP packet.
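The matching and target-selection logic carried by these fields can be summarized as follows; the 32-bit IPv4 arithmetic and the dictionary layout are illustrative, while the field names (EN, SUBNET, IP_DA, NH_EN, IP_NH, DST_PORT) follow the description above:

```python
def netmask(prefix_len: int) -> int:
    """Build a 32-bit IPv4 network mask from a prefix length (SUBNET field)."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF

def match_and_select(rte: dict, dst_ip: int):
    """Apply one routing table entry RTE to a packet's destination IP address.

    Returns (DST_PORT, target IP) on a match, or None otherwise. The target is
    the packet's own destination for a direct route, or IP_NH when NH_EN is set.
    """
    if not rte["EN"]:                                # entry holds no valid data
        return None
    mask = netmask(rte["SUBNET"])
    if (dst_ip & mask) != (rte["IP_DA"] & mask):     # destination range check
        return None
    # NH_EN selects between direct delivery and the next-hop gateway IP_NH.
    target = rte["IP_NH"] if rte["NH_EN"] else dst_ip
    return rte["DST_PORT"], target

rte = {"EN": True, "SUBNET": 24, "IP_DA": 0xC0A80100,   # 192.168.1.0/24
       "NH_EN": False, "IP_NH": 0, "DST_PORT": 2}
hit = match_and_select(rte, 0xC0A80105)                  # 192.168.1.5, reached directly
```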
In general, the number of physical memory slots/data words NW occupied by each routing table entry RTE may depend on the length of the IP address (i.e., whether IPv4 or IPv6 addresses are supported) and the number of bits of each memory slot.
Accordingly, during a search process, a search engine 400 inside each communication interface 40, having a parallelism equal to the number NB of memory banks, may access in parallel all NB memory banks 44_1 to 44_NB, read sequentially the stored entries RTE inside each memory bank and compare the specified destination IP address range (as indicated by the destination IP address IP_DA and the control data CNTRL) with the destination IP address of the IP packet. In case of a match, the search process may be stopped and the control data CNTRL may be used to determine the target of the communication, e.g., for determining a communication interface 40 to be used to transmit the IP packet, and the IP address of the next target (i.e., the final destination or a gateway node indicated in the next-hop IP address field IP_NH).
Accordingly, if required, the communication interface 40 may forward the IP packet to the indicated communication interface 40. The indicated (virtual or physical) communication interface may then determine the destination MAC address associated with the IP address of the next target, generate the respective Ethernet frame (by adding the destination MAC address and its own MAC address as source MAC address) and transmit the Ethernet frame to the connected network 20. For example, as well known in the art, the communication interface 40 may manage for this purpose a table of devices connected to the respective network 20, wherein this table comprises the MAC addresses of the devices and the respective IP addresses. For example, such a table may be obtained via the Address Resolution Protocol (ARP) and is usually called ARP cache.
The inventors have observed that the previously described organization of the memory 44 has several disadvantages. In fact, in order to implement the search, it is sufficient that a search engine 400 is able to determine just the destination IP address range (as specified by the data IP_DA and CNTRL) of the routing table entries RTE having stored valid data (as specified by the data CNTRL). Accordingly, in order to implement a parallel search function, parallel access to the complete routing table entries RTE (in particular the data IP_NH) is not required; a parallel access to the destination IP address IP_DA and part of the control data CNTRL is sufficient.
Accordingly,
Specifically, in the embodiment considered, the data of each of the routing data entry slots RTE1 to RTEN adapted to be stored to the memory 44 are now organized into two parts:
Accordingly, in line with the foregoing, the data of a given routing data entry slot RTEA may comprise:
For example, as shown in
Conversely, the data of a given routing data entry slot RTEB may comprise:
For example, as shown in
In the embodiment considered, the first part of routing data RTEA may thus be stored to respective routing data slots RTEA1 to RTEAN in a memory 48 and the second part of routing data RTEB may be stored to respective routing data slots RTEB1 to RTEBN in a memory 50.
Specifically, as shown in
In various embodiments, instead of filling sequentially the slots of data RTEA1 to RTEAN in the various memories 481 to 48NB, the data are stored in an interleaved manner to the memories 481 to 48NB, i.e., the first data RTEA1 associated with a first routing data entry RTE1 are stored to the first slot of the first memory 481, the second data RTEA2 associated with a second routing data entry RTE2 are stored to the first slot of the second memory 482, . . . , and the last data RTEAN associated with a last routing data entry RTEN are stored to the last slot of the last memory 48NB.
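A minimal sketch of this interleaved placement, assuming NB = 4 banks (the value is an illustrative assumption):

```python
NB = 4  # assumed number of memory banks 48_1 .. 48_NB

def interleaved_location(entry_index):
    # Entry i goes to bank (i mod NB), slot (i div NB): entry 0 to the first
    # slot of the first bank, entry 1 to the first slot of the second bank, ...
    return entry_index % NB, entry_index // NB

print(interleaved_location(0))  # → (0, 0): RTEA1, first slot of bank 1
print(interleaved_location(1))  # → (1, 0): RTEA2, first slot of bank 2
print(interleaved_location(4))  # → (0, 1): entry 5 wraps to the second slot of bank 1
```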
Conversely, since a parallel access to the routing data RTEB may not be required, the data RTEB may be stored to a single memory 50. Accordingly, even though the memory 50 could indeed be implemented with a plurality of physical RAM memory banks, these memory banks are connected to the same memory interface of the memory controller 42.
Accordingly, in the embodiment considered, each search circuit 4001 to 400NB of a given communication interface 40 may access (via the memory controller 42) a respective memory 481 to 48NB, and sequentially read the respective entries RTEA whose enable field EN indicates that the respective routing table entry RTEA contains valid data. In parallel, the search engine 400 may determine whether the destination IP address of the IP packet is included in the destination address range indicated by the record RTEA.
Accordingly, once having found and selected a given record RTEA, the respective search circuit 4001 to 400NB or another circuit of the search engine 400 also knows the index of the selected record RTEA, and may use this index in order to read the respective data RTEB from the memory 50. In general, the data RTEB may be stored in any suitable manner to the memory 50, which still permits a univocal mapping of a given selected record RTEA to the respective record RTEB. For example:
Accordingly, in the embodiment considered, the memory 48 stores up to N records RTEA indicating the supported destination IP addresses IP_DA, the respective associated subnet mask SUBNET, and the entry enable field EN of the routing table entry RTE. This memory 48 may be implemented with a selectable level NB of parallelism. Preferably, an interleaved scheme is used to store the various records RTEA. Accordingly, the search engines 400 of the various communication interfaces 401 to 40NP may have a parallelism equal to the number NB of memory banks 481 to 48NB, and may sequentially read the content of the memory 48, which is shared among all the communication interfaces 40. As soon as a match is found, the search process may be stopped and the index of the matching entry may be used to access the respective record RTEB in the memory 50 of the same routing table entry RTE, wherein the record RTEB stores the respective next-hop address IP_NH and the respective additional control data CNTRL. Accordingly, the memory 50 is likewise shared among all communication interfaces 40, and may be implemented using a single memory bank.
Generally, a routing table could also include a plurality of routes which could apply to a given destination IP address. Accordingly, in various embodiments, the search circuits 4001 to 400NB could return all matching records RTEA, and the search engine 400 could select a best matching route, e.g., by using a metric or cost stored with the control data CNTRL of the record RTEB. Conversely, as indicated in the foregoing, in various embodiments, the first matching route is selected. In this case, the records RTE (and accordingly the respective records RTEA and RTEB) should be ordered accordingly. For example, in typical routing methods, the longest matching mask is selected, i.e., the smallest matching destination address range IP_DA. Accordingly, in this case, the routing table entries RTE should already be stored to the memories 48 and 50 in the requested order. For example, such a re-ordering of the routing table may be managed by a processing core 102 of the processing system 10, which may write the memories 48 and 50.
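The re-ordering for longest-prefix matching mentioned above may be sketched as follows; sorting by descending mask length ensures that a sequential first-match search returns the most specific route (names and data layout are illustrative):

```python
def order_for_first_match(entries):
    # Larger SUBNET values mean longer prefixes, i.e. smaller address
    # ranges; placing them first makes "first match" equal "longest match".
    return sorted(entries, key=lambda e: e["subnet"], reverse=True)

entries = [
    {"ip_da": "10.0.0.0", "subnet": 8},
    {"ip_da": "10.1.0.0", "subnet": 16},
    {"ip_da": "10.1.2.0", "subnet": 24},
]
ordered = order_for_first_match(entries)
print([e["subnet"] for e in ordered])  # → [24, 16, 8]
```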
Accordingly, in various embodiments, the content of memories 48 and 50 is programmable via a processing core 102 of the processing system 10. In general, any suitable address mapping may be used to map a given sub-range of the physical address range of the communication system 114 to the physical address range of the memories 481 to 48NB and 50. In general, it is not required that this mapping indeed reflects the order of the storage locations within the memories 481 to 48NB and 50.
For example,
For example, in the embodiment shown in
Similarly, in the embodiment shown in
Accordingly, in the embodiment considered, the memory 48 may store data IP_DA0 to IP_DA(N-1) for the N destination IP addresses, and data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL for the respective control data, i.e., the memory 48 has at least 8·N bytes of space, which may be divided into NB memory banks 481 to 48NB. Similarly, the memory 50 may store data IP_NH0 to IP_NH(N-1) for the N next-hop addresses, and data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL for the respective control data, i.e., also the memory 50 has at least 8·N bytes of space and may be implemented with a single memory bank. Accordingly, when using IPv4 addresses with a 32-bit memory, a total of four memory slots are occupied by the data of each routing table entry RTE (two memory slots in the memory 48 for the data RTEA and two memory slots in the memory 50 for the data RTEB).
For example, the memory controller 42 may perform an address translation operation, wherein the data IP_DA0 start at a given start address and the following addresses are mapped to the data IP_DA1 to IP_DA(N-1). Following addresses may then be mapped sequentially to the data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL, the data IP_NH0 to IP_NH(N-1) and the data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL. Accordingly, in the embodiment considered:
In general, a given group of data may follow immediately the previous group of data or the address map may have a gap between the respective address subranges. In various embodiments, the groups of data may also have a different order.
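For the IPv4 layout described above, the four sequential address groups may be sketched as follows; the number of entries N, the 32-bit word size and the zero base address are illustrative assumptions, and, as noted, the groups may also have gaps between them or a different order:

```python
N = 8      # assumed number of routing table entries
WORD = 4   # 32-bit memory slots, byte addressing

# Start of each group, assuming the groups follow each other immediately.
BASE_IP_DA       = 0
BASE_IP_DA_CNTRL = BASE_IP_DA + N * WORD
BASE_IP_NH       = BASE_IP_DA_CNTRL + N * WORD
BASE_IP_NH_CNTRL = BASE_IP_NH + N * WORD

def addr_ip_da(i):       return BASE_IP_DA + i * WORD        # IP_DA0 .. IP_DA(N-1)
def addr_ip_da_cntrl(i): return BASE_IP_DA_CNTRL + i * WORD  # IP_DA0_CNTRL ..
def addr_ip_nh(i):       return BASE_IP_NH + i * WORD        # IP_NH0 .. IP_NH(N-1)
def addr_ip_nh_cntrl(i): return BASE_IP_NH_CNTRL + i * WORD  # IP_NH0_CNTRL ..

print(addr_ip_da(0), addr_ip_da_cntrl(0), addr_ip_nh(0), addr_ip_nh_cntrl(0))
# → 0 32 64 96
```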
Conversely, in the embodiment shown in
Similarly, in the embodiment shown in
Accordingly, in the embodiment considered, the memory 48 may store data IP_DA_W0 to IP_DA_W3 for each of the N destination IP addresses IP_DA0 to IP_DA(N-1), and data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL for the respective control data, i.e., the memory 48 has at least 20·N bytes of space, wherein the respective records RTE may be stored into NB memory banks 481 to 48NB. Similarly, the memory 50 may store data IP_NH_W0 to IP_NH_W3 for each of the N next-hop addresses IP_NH0 to IP_NH(N-1), and data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL for the respective control data, i.e., also the memory 50 has at least 20·N bytes of space and may be implemented with a single memory bank. Accordingly, when supporting IPv6 addresses with a 32-bit memory, a total of ten memory slots are occupied by the data of each routing table entry RTE (five memory slots in the memory 48 for the data RTEA and five memory slots in the memory 50 for the data RTEB).
For example, the memory controller 42 may perform an address translation operation, wherein the data word IP_DA0_W0 of the first destination IP address IP_DA0 starts at a given start address, the following addresses are mapped to the other data words IP_DA0_W1 to IP_DA0_W3 of the first destination IP address IP_DA0, and similarly to the data words IP_DA_W0 to IP_DA_W3 of the following destination IP addresses IP_DA1 to IP_DA(N-1). The following addresses are then mapped sequentially to the data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL, to the data words IP_NH_W0 to IP_NH_W3 of the next-hop addresses IP_NH0 to IP_NH(N-1) (starting with the data word IP_NH0_W0), and to the data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL. Thus, also in this case, the address range is organized in four groups for the data IP_DA, IP_DA_CNTRL, IP_NH and IP_NH_CNTRL.
In both cases, the control data IP_DA_CTRL may comprise, in addition to the entry enable/valid bit EN, a field SUBNET for specifying the subnet mask of the address. The length of this field may be application dependent. For example, the inventors have observed that for typical applications the field SUBNET may have:
For example, in case of IPv4 addresses, the respective value stored to the field SUBNET may indicate an integer value corresponding to the number of Most Significant Bits (MSB) of the address IP_DA which are held valid, and the remaining Least Significant Bits (LSB) of the address IP_DA may be masked for determining the respective destination IP address range of the routing table entry. For example, the binary values “011000” of the field SUBNET may indicate the subnet “/24”, which corresponds to a netmask of “255.255.255.0”.
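The mapping from the SUBNET field to a netmask may be sketched as follows (a behavioral illustration only):

```python
def netmask_from_prefix(prefix):
    # The SUBNET field stores the number of valid MSBs of IP_DA; the
    # remaining LSBs are masked out when matching the destination range.
    mask = ((0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF) if prefix else 0
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))

# The binary SUBNET value "011000" is the integer 24, i.e. the "/24" subnet:
print(netmask_from_prefix(0b011000))  # → 255.255.255.0
```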
Conversely, the control data IP_NH_CTRL may comprise, in addition to the next-hop enable field NH_EN, a field DST_PORT for specifying a communication interface/network port 40 to be used to transmit the IP packet. The length of this field depends on the number of supported (physical or virtual) communication interfaces 40. For example, in various embodiments, the field DST_PORT may have 6 bits, which permits supporting up to 64 (physical or virtual) communication interfaces 40.
Accordingly, in the embodiments considered, the memory controller 42 may be configured to implement an address translation and provide an address range, which is addressable by one or more of the processing cores 102 of the processing system 10. Those of skill in the art will appreciate that the address map of the memory controller 42 may also be organized differently. For example, the data IP_DA, IP_DA_CTRL, IP_NH and IP_NH_CTRL of a given routing table entry RTE may have consecutive addresses.
Accordingly, the processing core 102 may use the address map provided by the memory controller 42 in order to access via software instructions the memory slots of the memories 48 (in particular the respective memory banks 481 to 48NB) and 50. As mentioned before, indeed the various memory slots described with respect to
Accordingly, a processing core 102 and/or another processing system 10 may be configured to implement a routing algorithm, such as Open Shortest Path First (OSPF) or Border Gateway Protocol (BGP), thereby defining the content of the routing table entries RTE and the respective order. Next, the processing core 102 and/or the other processing system 10 may determine the respective records RTEA and RTEB and store the respective data to the memories 48 (memories 481 to 48NB) and 50 by using the address map provided by the memory controller 42.
Specifically, as will be described in greater detail in the following, in various embodiments, the processing core 102 (or another master circuit connected to the communication system 114) may read or write the content of the memories 48 and 50 by sending a read or write request to the communication system 114, wherein the request comprises an address of a sub-range managed by the communication interface 46. Thus, the address map of the physical address range of the communication system 114 may also be different from the address map provided by the address translation circuit. For example, in various embodiments, the address map of the translation circuit starts at 0, while the slave interface 46 may have an associated address range starting at a given start address/offset and having a dimension corresponding to the dimension of the address map provided by the circuit. Accordingly, in this case, the slave interface 46 may receive a request and generate the address ADDR provided to the address translation circuit by removing the start address/offset from the address received with the request.
In addition to, or as an alternative to, a slave interface, the interface 46 may also comprise an integrated DMA controller, whereby the processing core 102 may program the routing table by storing the respective data to the memory 104b and the integrated DMA controller may automatically transfer the routing table from the memory 104b to the address translation circuit 424.
Specifically, as described in the foregoing, in various embodiments, each communication interface 40 comprises a respective search engine 400, which comprises a plurality of NB search circuits 4001 to 400NB. Accordingly, each communication interface 40/search engine 400 generates NB sets of control signals for performing in parallel read operations from the NB memory banks 481 to 48NB. Specifically, the first sets of control signals generated by the first search circuits 4001 of the communication interfaces 401 to 40NP are provided to a first arbiter 4201, which selects a search circuit 4001 permitted to access the first memory bank 481. Similarly, the second sets of control signals generated by the second search circuits 4002 of the communication interfaces 401 to 40NP are provided to a second arbiter 4202, which selects a search circuit 4002 permitted to access the second memory bank 482. The same connections are also applied for the other control signals, e.g., the last sets of control signals generated by the last search circuits 400NB of the communication interfaces 401 to 40NP are provided to a last arbiter 420NB, which selects a search circuit 400NB permitted to access the last memory bank 48NB.
Similarly, in various embodiments, each communication interface 40/search engine 400 generates a further set of control signals for performing a read operation from the memory bank 50. In general, these further control signals may be generated directly by each search circuit 4001 to 400NB of a given search engine 400 or a shared circuit of the search engine 400. Accordingly, the further sets of control signals generated by the communication interfaces 401 to 40NP are provided to a further arbiter 422, which selects a communication interface 401 to 40NP permitted to access the memory bank 50. For example, the arbiters 4201 to 420NB and 422 may implement a round-robin arbitration.
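A round-robin arbiter of the kind mentioned above may be sketched as follows (a behavioral model; the class and signal names are illustrative assumptions):

```python
class RoundRobinArbiter:
    # Grants one of n requesters per cycle, rotating the priority so that
    # the last granted requester has the lowest priority in the next cycle.
    def __init__(self, n):
        self.n = n
        self.last = n - 1  # so requester 0 has highest priority initially

    def grant(self, requests):
        # requests: list of booleans, one per requester; returns the index
        # of the granted requester, or None if nobody requests access.
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None

arb = RoundRobinArbiter(3)
print(arb.grant([True, True, False]))  # → 0
print(arb.grant([True, True, False]))  # → 1
print(arb.grant([True, True, False]))  # → 0
```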
As described in the foregoing, in various embodiments, the memory controller 42 manages also the access of the communication interface 46 (connected to the communication system 114) to the memory banks 481 to 48NB and 50. In this respect, as described with respect to
In various embodiments, instead of providing the signals of the dispatcher circuit 424 to the arbiters 4201 to 420NB and 422, the memories 48 and 50 are implemented with dual port memories, and the arbiters 4201 to 420NB and 422 (and thus the search circuits 400) may access a first memory port, and the dispatcher circuit 424 may access a second memory port of the respective memory.
Specifically, while the communication interfaces 401 to 40NP are preferably configured to perform only read requests (and not write requests) to the memory banks 481 to 48NB and 50, the communication interface 46 is configured to perform (via the address translation circuit 424) at least write requests (and preferably also read requests) to the memory banks 481 to 48NB and 50. In this respect, in various embodiments, the communication interface 46 is a slave interface connected to the communication system 114 of the processing system 10. Accordingly, such a slave interface 46 may receive read or write requests from any master interface connected to the communication system 114, such as a processing core 102 or a DMA controller 110. In general, the slave interface 46, the various master interfaces and/or the communication system 114 may also implement an access protection in order to limit read or write access to the addresses in the address map provided by the circuit 424 (and accordingly to the memory banks 481 to 48NB and 50).
Accordingly, in the embodiment considered, the routing table memory controller 42 is used to arbitrate the access requests to the shared routing table stored to the memory banks 481 to 48NB and 50, which can concurrently arrive from the communication interfaces 401 to 40NP and from another circuit (external to the network accelerator 4) via the communication/programming interface 46 (e.g., an AXI slave). The search engines 4001 to 400NB inside each communication interface/network port 401 to 40NP preferably can generate only read access requests during the lookup process, while the programming interface 46 may generate both read and write requests to the memory banks 481 to 48NB and 50.
Specifically, in the embodiment considered, each memory bank 481 to 48NB and 50 has associated a respective arbitration circuit 4201 to 420NB and 422, which is configured to arbitrate the incoming access requests, e.g., by using a round-robin scheduling algorithm.
Concerning the interface 46, an address translation and dispatcher circuit 424 is configured to manage an address map (see
In various embodiments, the address translation operation (circuit 424) is only implemented for the access via the interface 46, while the search circuits 4001 to 400NB access the memories by using directly the addresses of the respective memory slots. In fact, each search circuit 4001 to 400NB should only access a respective memory bank 481 to 48NB.
In the following, possible embodiments of the operations of the slave interface 46 and of the address translation and dispatcher circuit 424 will now be described in greater detail.
Specifically,
Specifically, in the embodiment considered, a given number n of LSB bits of the address ADR correspond to the address ADDR to be provided to the dispatcher circuit 424. Accordingly, the remaining MSB bits may be used to identify the interface 46 within the communication system 114.
In the embodiment considered, a first bit MEMT of the address ADDR, preferably the MSB of the address ADDR, indicates the memory type, i.e., whether to access the memory 48 (e.g., MEMT=“0”) or the memory 50 (e.g., MEMT=“1”). Moreover, a second bit DC indicates whether to access the data section (e.g., DC=“0”), i.e., the slots IP_DA or IP_NH, or the control section (e.g., DC=“1”), i.e., the slots IP_DA_CNTRL or IP_NH_CNTRL, of the respective memory. Finally, a field SLOT# indicates the respective slot number. Accordingly, when N slots have to be supported, the field SLOT# has at least n=log2(N) bits, e.g., bits SLOT#[n-1:0].
Specifically, as shown in
Conversely, as shown in
In various embodiments, the address field ADDR may also comprise a byte index B#. For example, this byte index B# may be useful in order to permit programming of single bytes in the memories 48 and 50. For example, when the memories 48 and 50 have a 32-bit data width, the field B# may have two bits and indicate one of the 4 bytes of a respective memory slot indicated by the other address data ADDR. In general, the byte index B# may be used when the communication system 114 has a data width smaller than the data width of the memories 48 and 50, or in order to ensure that only single bytes may be programmed.
Thus, in various embodiments, the (word) index W# may be used to select a given word of the record selected via the field SLOT#, and the byte index B# may be used to select a given byte in the selected memory slot as indicated by the data SLOT# and W#.
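Putting the fields together, a possible decoding of the address ADDR may be sketched as follows; only the placement of MEMT as MSB follows the description above, while the exact widths and positions of the remaining fields are illustrative assumptions:

```python
SLOT_BITS = 8   # log2(N) bits for SLOT# (N = 256 entries assumed)
W_BITS = 3      # word index W# (IPv6 records span words 0..4)
B_BITS = 2      # byte index B# within a 32-bit memory slot

def decode_addr(addr):
    # Split a translated address into its fields, assuming the layout
    # [MEMT | DC | SLOT# | W# | B#] from MSB down to LSB.
    b = addr & ((1 << B_BITS) - 1)
    addr >>= B_BITS
    w = addr & ((1 << W_BITS) - 1)
    addr >>= W_BITS
    slot = addr & ((1 << SLOT_BITS) - 1)
    addr >>= SLOT_BITS
    dc = addr & 1
    memt = (addr >> 1) & 1
    return {"MEMT": memt, "DC": dc, "SLOT": slot, "W": w, "B": b}

# Memory 50 (MEMT = 1), data section (DC = 0), slot 5, word 2, byte 1:
addr = ((1 << (1 + SLOT_BITS + W_BITS + B_BITS))
        | (5 << (W_BITS + B_BITS)) | (2 << B_BITS) | 1)
print(decode_addr(addr))  # → {'MEMT': 1, 'DC': 0, 'SLOT': 5, 'W': 2, 'B': 1}
```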
Specifically, once the address translation circuit 424 receives at a step 4000 a (read or write) request comprising a given address ADDR in the address map of the dispatcher circuit 424, the address translation circuit 424 determines whether the received address ADDR is associated with the memory 48 or the memory 50.
For example, in the embodiments shown in
Optionally, the address translation circuit 424 may compare the received address ADDR with a lower threshold corresponding to the address associated with the first slot in the memory 48 (e.g., corresponding to the address of the memory slot IP_DA0_W0) and/or with an upper threshold corresponding to the address associated with the last slot in the memory 50 (e.g., corresponding to the address of the memory slot IP_NH(N-1)_CNTRL).
Conversely, in the embodiments shown in
Accordingly, in case the address ADDR is associated with the memory 48 (as schematically shown via an output “0” of the step 4002), the dispatcher circuit 424 proceeds to a verification step 4004.
Specifically, the step 4004 is purely optional and is used in case different address translation operations are required for the destination IP addresses IP_DA and the control data IP_DA_CNTRL. In fact, in the embodiment considered, the address translation circuit 424 determines whether the received address ADDR is associated with a destination IP address IP_DA or control data IP_DA_CNTRL. For example, in the embodiments shown in
Conversely, in the embodiments shown in
Accordingly, in case the address ADDR is associated with data IP_DA (as schematically shown via an output “0” of the step 4004), the address translation circuit 424 proceeds to a step 4008, where the address translation circuit 424 applies the address translation operation in order to access the data IP_DA in the memory 48.
For example, as described in the foregoing, the field SLOT# may indicate a given record slot, and optionally the index W# may indicate a respective data word of the record. Specifically, by storing the records RTEA in an interleaved manner to the memories 481 to 48NB, a given number nb=log2(NB) of LSB bits of the field SLOT# may be used to select one of the memory banks 481 to 48NB, and the remaining bits of the field SLOT# may be used to select a respective record slot in the selected memory bank 481 to 48NB. Specifically, as described with respect to
Conversely, in case the address ADDR is associated with control data IP_DA_CTRL (as schematically shown via an output “1” of the step 4004), the address translation circuit 424 proceeds to a step 4010, where the address translation circuit 424 applies the address translation operation in order to access the control data IP_DA_CTRL in the memory 48.
For example, as described in the foregoing, the field SLOT# may indicate a given record. Specifically, by storing the slots in an interleaved manner to the memories 481 to 48NB, a given number nb=log2(NB) of LSB bits of the field SLOT# may again be used to select one of the memory banks 481 to 48NB, and the remaining bits of the field SLOT# may be used to select a respective record slot in the selected memory bank 481 to 48NB. Specifically, as described in the foregoing, preferably the control data IP_DA_CTRL use only a single memory slot. Accordingly, both for IPv4 and IPv6, each memory slot may already correspond to the respective record slot of control data IP_DA_CNTRL, i.e., the field W# may be omitted for the control data IP_DA_CTRL.
Conversely, in case the address ADDR is associated with the memory 50 (as schematically shown via an output “1” of the step 4002), the address translation circuit 424 proceeds to a verification step 4006.
Specifically, the step 4006 is likewise purely optional and is used in case different address translation operations are required for the next-hop addresses IP_NH and the control data IP_NH_CNTRL. In fact, in the embodiment considered, the address translation circuit 424 determines whether the received address ADDR is associated with a next-hop address IP_NH or control data IP_NH_CNTRL. For example, in the embodiments shown in
Again, in the embodiments shown in
Accordingly, in case the address ADDR is associated with data IP_NH (as schematically shown via an output “0” of the step 4006), the address translation circuit 424 proceeds to a step 4012, where the address translation circuit 424 applies the address translation operation in order to access the data IP_NH in the memory 50.
For example, as described in the foregoing, the field SLOT# may indicate a given record, and optionally the field W# may indicate a given word in the record. Specifically, by storing the slots sequentially to the memory 50, the field SLOT# may be used to select a respective record slot in the memory 50. Specifically, in case of IPv4 addresses and 32-bit memories, each memory slot again corresponds to the respective record slot. Conversely, as described with respect to
Conversely, in case the address ADDR is associated with control data IP_NH_CTRL (as schematically shown via an output “1” of the step 4006), the address translation circuit 424 proceeds to a step 4014, where the address translation circuit 424 applies the address translation operation in order to access the control data IP_NH_CTRL in the memory 50.
For example, as described in the foregoing, the field SLOT# may indicate a given record. Specifically, by storing the slots sequentially to the memory 50, the field SLOT# may be used to select a respective record slot in the memory 50. Specifically, as described in the foregoing, preferably the control data IP_NH_CTRL use only a single memory slot. Accordingly, both for IPv4 and IPv6, each memory slot may already correspond to the respective record slot of control data IP_NH_CNTRL.
As described in the foregoing, in various embodiments, the dispatcher circuit may also support the byte index B#. For example, this byte index may be used to select a given byte of a selected memory slot, as indicated by the field SLOT# and optionally W#.
In various embodiments, the slots of the memories 48 and 50 may also have fewer bits than the data width of the communication system 114. In this case, the dispatcher 424 may be configured to automatically generate a plurality of read or write requests. For example, such multiple read or write requests may automatically access consecutive memory locations in the same memory. Moreover, in case of the memory 48, such multiple read or write requests may also access in parallel respective memory banks 481 to 48NB.
In various embodiments, the control data IP_DA_CNTRL and/or IP_NH_CNTRL may also comprise further data. For example, in various embodiments, the control data IP_DA_CNTRL and/or IP_NH_CNTRL comprise additional Error Correction Code (ECC) data. For example, the ECC data included in the data IP_DA_CNTRL may be calculated according to a given ECC scheme (at least) based on the data IP_DA and SUBNET. Similarly, the ECC data included in the data IP_NH_CNTRL may be calculated according to a given ECC scheme (at least) based on the data IP_NH, NH_EN and DST_PORT. For example, in various embodiments, a Single-Error-Correction and Double-Error-Detection (SECDED) code is used.
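As an illustration of the SECDED principle (not of the particular ECC scheme used, which the description leaves open), a minimal Hamming(7,4) code extended with an overall parity bit corrects any single-bit error and detects any double-bit error:

```python
def parity(x):
    # XOR of all bits of x.
    p = 0
    while x:
        p ^= x & 1
        x >>= 1
    return p

def secded_encode(nibble):
    # Encode 4 data bits into an 8-bit codeword: Hamming(7,4) in bits 0..6
    # (positions 1..7) plus an overall parity bit in bit 7.
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]          # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]          # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]          # covers positions 4, 5, 6, 7
    word = (p1 | (p2 << 1) | (d[0] << 2) | (p4 << 3)
            | (d[1] << 4) | (d[2] << 5) | (d[3] << 6))
    return word | (parity(word) << 7)

def secded_data(code):
    # Extract the 4 data bits from positions 3, 5, 6, 7 (code bits 2, 4, 5, 6).
    return (((code >> 2) & 1) | (((code >> 4) & 1) << 1)
            | (((code >> 5) & 1) << 2) | (((code >> 6) & 1) << 3))

def secded_decode(code):
    # Returns (status, data) with status "ok", "corrected" or "double-error".
    syndrome = 0
    for pos in range(1, 8):          # XOR of the positions of all set bits
        if (code >> (pos - 1)) & 1:
            syndrome ^= pos
    overall_ok = parity(code) == 0
    if syndrome == 0:
        # No error, or the overall parity bit itself flipped.
        return ("ok" if overall_ok else "corrected", secded_data(code))
    if overall_ok:
        return ("double-error", None)   # detected but not correctable
    code ^= 1 << (syndrome - 1)         # single error: flip the faulty bit
    return ("corrected", secded_data(code))

code = secded_encode(0b1011)
print(secded_decode(code))               # → ('ok', 11)
print(secded_decode(code ^ (1 << 4)))    # → ('corrected', 11)
print(secded_decode(code ^ 0b10001)[0])  # → double-error
```

In a real memory controller the same classification drives the fault signaling described below: a corrected single-bit error may be reported as a warning, while a detected double-bit error is an uncorrectable fault.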
For example, such an arrangement is particularly useful, e.g., in case of ASIL-B compliant processing systems, because the memory controller 42 and/or the search engines 400 may verify the correctness of the data stored to the memories 48 and 50 without any intervention of a processing core 102. For example, in case a fault is detected (such as an uncorrectable error, e.g., a double-bit error), the memory controller 42 and/or the respective search engine 400 may signal the error to a fault collection and error management circuit of the processing system 10, which may signal the error to the one or more processing cores 102 and/or to a pad or pin of the integrated circuit of the processing system 10.
Accordingly, in the solutions described in the foregoing, the organization of the routing table data permits a fast lookup of a destination IP address within an IP routing table, without relying on slow software-based lookups or complex and expensive Ternary Content Addressable Memories (TCAM). For this purpose, the proposed solutions may perform a parallel search by accessing in parallel a plurality of memory banks 481 to 48NB which store the principal data of the search operation. In fact, in various embodiments, the routing table data are split and stored to two different memories 48 and 50, whose content is optimized for the routing operations executed in HW. This separation permits the memory 48 to contain (only) the data useful for detecting a match of the destination IP address in the specified destination IP ranges, while the other data specifying the respective route (e.g., next-hop, egress interface index, etc.) may be stored in a single memory bank 50, whose content is retrieved only after a match is found.
Accordingly, the disclosed solutions permit a fast lookup process that does not significantly delay the packet transmission, thereby achieving high data rates and low latencies.
Of course, without prejudice to the principle of the invention, the details of construction and the embodiments may vary widely with respect to what has been described and illustrated herein purely by way of example, without thereby departing from the scope of the present invention, as defined by the ensuing claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 102023000002871 | Feb 2023 | IT | national |