The present invention relates to network packet processing, and more particularly, to a network packet processing apparatus that uses a memory with a lower access latency to store a partial packet content to improve the packet pre-processing performance of a network processing unit, and an associated network packet processing method.
A network processing unit (NPU) is a processor dedicated to network packet processing, with features and an architecture designed to accelerate the handling of network packets. For example, regarding packet forwarding, the NPU can perform packet pre-processing to determine a matched forwarding rule, and the network chip can then refer to the matched forwarding rule to forward a network packet received at one network port through another network port that satisfies the matched forwarding rule. In a conventional embedded device with limited resources, a dynamic random access memory (DRAM) is generally used to store a large number of network packets. Therefore, when the NPU performs packet pre-processing, it needs to read the packet data from the DRAM. However, the DRAM has a very high access latency; for example, each read operation of the DRAM requires at least 150 nanoseconds. Since the packet pre-processing performance of the NPU is limited by the DRAM's high access latency, the overall packet forwarding efficiency is degraded by the NPU's limited packet pre-processing performance.
One of the objectives of the claimed invention is to provide a network packet processing apparatus that uses a memory with a lower access latency to store a partial packet content to improve the packet pre-processing performance of a network processing unit, and an associated network packet processing method.
According to a first aspect of the present invention, an exemplary network packet processing apparatus is disclosed. The exemplary network packet processing apparatus includes a first memory, a second memory, a direct memory access (DMA) controller, and a network processing unit (NPU). An access latency of the second memory is lower than an access latency of the first memory. The DMA controller is arranged to write a network packet into the first memory, and write a partial packet content of the network packet into the second memory. The NPU is arranged to read the partial packet content from the second memory, and perform packet pre-processing of the network packet according to the partial packet content.
According to a second aspect of the present invention, an exemplary network packet processing method is disclosed. The exemplary network packet processing method includes: writing a network packet into a first memory through direct memory access; writing a partial packet content of the network packet into a second memory through direct memory access, wherein an access latency of the second memory is lower than an access latency of the first memory; and reading the partial packet content from the second memory, and performing packet pre-processing of the network packet according to the partial packet content.
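By way of an illustrative, non-limiting behavioral sketch of the claimed method (in Python; the 64-byte header length, the dict-based "memories", and all function names are assumptions for illustration, not part of the disclosed hardware): the full packet is written into the high-latency memory, while only the partial packet content is mirrored into the low-latency memory that is read during pre-processing.

```python
HEADER_LEN = 64  # assumed size (in bytes) of the partial packet content


def dma_write(packet: bytes, slow_mem: dict, fast_mem: dict, addr: int) -> None:
    """Model of the DMA step: the full packet goes to the high-latency
    memory, and only the partial packet content (here, the leading
    header bytes) is mirrored into the low-latency memory."""
    slow_mem[addr] = packet                 # e.g., DRAM (high access latency)
    fast_mem[addr] = packet[:HEADER_LEN]    # e.g., SRAM (low access latency)


def pre_process(fast_mem: dict, addr: int) -> bytes:
    """Model of the pre-processing read path: only the low-latency
    memory is touched, which is the source of the performance gain."""
    return fast_mem[addr]
```

In this sketch, a pre-processing pass never issues a read toward the slow memory; the slow memory is touched only by the DMA write and by the later forwarding of the full packet.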
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
Certain terms are used throughout the following description and claims, which refer to particular components. As one skilled in the art will appreciate, electronic equipment manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
The NPU 108 (particularly, packet pre-processing circuit 114 of NPU 108) may read the partial packet content PH from the memory 104, and perform packet pre-processing (e.g., pre-processing for packet forwarding) of the network packet PKT according to the partial packet content PH. Compared with reading data required by the packet pre-processing circuit 114 from the memory 102 with higher access latency, the present invention reads data required by the packet pre-processing circuit 114 from the memory 104 with lower access latency. This can greatly improve the packet pre-processing performance of the packet pre-processing circuit 114, thereby improving the packet forwarding performance of the subsequent network chip. The principle of the network packet processing apparatus 100 of the present invention will be described in detail below.
In this embodiment, the DMA controller 106 includes an address sniffing circuit 110 and a memory synchronization circuit 112. The address sniffing circuit 110 allocates a memory address to be monitored. For example, a sniffer list 109 can record at most N memory addresses addr1, addr2, . . . , addrN. The address sniffing circuit 110 can read the memory address to be monitored (e.g., addr1) from the sniffer list 109 through an index value IDX_R. In addition, during a process in which the DMA controller 106 performs a DMA operation upon the memory 102, the address sniffing circuit 110 further monitors at least one write address at which the DMA controller 106 performs writing upon the memory 102, and triggers the memory synchronization circuit 112 to write the partial packet content PH into the memory 104 when the memory address to be monitored hits (i.e., matches) the at least one write address. In this embodiment, before the network packet PKT is transmitted to the memory 102, the memory synchronization circuit 112 transmits the partial packet content PH to the memory 104, which ensures that the memory 104 has the partial packet content PH of the network packet PKT after the network packet PKT is written into the memory 102. In other words, the same partial packet content PH is stored in both of the memories 102 and 104 synchronously.
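The sniff-and-synchronize behavior described above may be sketched as follows (a behavioral model in Python; the hit test against a burst's write range, the 64-byte header length, and the dict-based memories are illustrative assumptions):

```python
HEADER_LEN = 64  # assumed size (in bytes) of the partial packet content PH


def dma_burst_write(mem102: dict, mem104: dict, monitored_addr: int,
                    burst_addr: int, data: bytes) -> bool:
    """Model of one DMA burst toward the high-latency memory 102.
    When the monitored address falls within the burst's write range
    (a hit), the memory-synchronization step also mirrors the partial
    packet content into the low-latency memory 104. Returns the hit
    result of the address comparison."""
    hit = burst_addr <= monitored_addr < burst_addr + len(data)
    if hit:
        offset = monitored_addr - burst_addr
        mem104[monitored_addr] = data[offset:offset + HEADER_LEN]
    mem102[burst_addr] = data
    return hit
```

A burst whose address range does not cover the monitored address leaves the low-latency memory untouched, matching the description that synchronization is triggered only on a hit.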
In addition to supporting the packet pre-processing function, the NPU 108 is further arranged to create and maintain the aforementioned sniffer list 109. For example, the sniffer list 109 may be stored in the memory 102. The sniffer list 109 may have a fixed length and may be programmed to have a plurality of entries 111 for recording a plurality of memory addresses addr1-addrN of the memory 102 that are available for a plurality of network packets, respectively. The NPU 108 first uses a default value (e.g., 0x0) to initialize all entries 111 in the sniffer list 109 (e.g., addr1=0x0, addr2=0x0, . . . , addrN=0x0), and sets the index value IDX_W to an initial value (e.g., IDX_W=1). Afterwards, the NPU 108 (particularly, packet pre-processing circuit 114 of NPU 108) refers to the index value IDX_W to write a memory address of the memory 102 that is available for a network packet into the sniffer list 109, and updates the index value IDX_W (e.g., IDX_W=IDX_W+1) accordingly. In other words, the index value IDX_W is used to indicate which entry 111 in the sniffer list 109 now can be filled with a new memory address to be monitored.
In addition, the index value IDX_R is used to indicate which entry 111 in the sniffer list 109 now can be read by the address sniffing circuit 110 to allocate the memory address to be monitored by the address sniffing circuit 110. When at least one write address at which the DMA controller 106 performs writing upon the memory 102 hits the current memory address to be monitored that is allocated at the address sniffing circuit 110 (for example, the memory address to be monitored falls within a memory address range to be written by a DMA burst), the address sniffing circuit 110 updates the index value IDX_R (e.g., IDX_R=IDX_R+1) for reading a next memory address to be monitored from the sniffer list 109 to take the place of the current memory address to be monitored.
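The interplay of the two index values may be sketched as a behavioral model (in Python; the class name, the 1-based indexing chosen to match IDX_W=1 and IDX_R=1 in the description, and the wrap-around arithmetic are illustrative assumptions):

```python
class SnifferList:
    """Behavioral model of the sniffer list 109 shared by the NPU and
    the address sniffing circuit. Entries are 1-indexed to mirror the
    description (both IDX_W and IDX_R start at 1); both indices wrap
    around, i.e., the list behaves as a ring buffer."""

    def __init__(self, n: int):
        self.entries = [0x0] * (n + 1)  # entry 0 unused; holds addr1..addrN
        self.idx_w = 1                  # next entry the NPU may fill (IDX_W)
        self.idx_r = 1                  # next entry the sniffer reads (IDX_R)

    def npu_push(self, addr: int) -> None:
        """NPU writes a memory address available for a packet at the
        entry indicated by IDX_W, then advances IDX_W with wrap-around."""
        self.entries[self.idx_w] = addr
        self.idx_w = self.idx_w % (len(self.entries) - 1) + 1

    def sniffer_next(self) -> int:
        """Reads the entry indicated by IDX_R as the current memory
        address to be monitored, and advances IDX_R with wrap-around
        (as is done once the monitored address hits)."""
        addr = self.entries[self.idx_r]
        self.idx_r = self.idx_r % (len(self.entries) - 1) + 1
        return addr
```

With this model, IDX_W trails the NPU's allocations while IDX_R trails the sniffing circuit's consumption, so the two circuits never contend for the same entry as long as the list does not overflow.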
When the current index value IDX_R is equal to 1 (i.e., IDX_R=1), the address sniffing circuit 110 reads the entry indicated by the index value IDX_R=1 to obtain a memory address (i.e., addr1=0x0880) in the sniffer list 109 that is set by the packet pre-processing circuit 114 to act as a current memory address to be monitored, and updates the index value IDX_R to 2. When a new memory address to be monitored needs to be set subsequently, the address sniffing circuit 110 reads the entry indicated by the index value IDX_R=2 to obtain a memory address (i.e., addr2=0x1080) in the sniffer list 109 that is set by the packet pre-processing circuit 114 to act as the memory address to be monitored.
In some embodiments of the present invention, the NPU 108 (particularly, packet pre-processing circuit 114 of NPU 108) writes memory addresses into the sniffer list 109 through a data structure of a ring buffer, and the address sniffing circuit 110 reads memory addresses from the sniffer list 109 through the data structure of the ring buffer.
When the address sniffing circuit 110 detects a hit of a monitored memory address through memory address comparison, the address sniffing circuit 110 triggers the memory synchronization circuit 112 to write the packet content (e.g., header) required by packet pre-processing into the memory 104.
When the comparison of memory addresses indicates a hit of the monitored memory address, the address sniffing circuit 110 triggers the memory synchronization circuit 112 to write a partial packet content (e.g., a header) into the memory 104. At step S514, the NPU 108 (particularly, packet pre-processing circuit 114 of NPU 108) reads the partial packet content (e.g., header) from the memory 104 for packet pre-processing. At step S516, when forwarding of the packet in the memory 102 is completed, the storage space originally occupied by the packet can be released for use by a new packet received by a network port. Therefore, the NPU 108 (particularly, packet pre-processing circuit 114 of NPU 108) can store a new memory address into the ring buffer 302. That is, the new memory address is written into the sniffer list 109 maintained by the ring buffer 302 to act as a future memory address to be monitored.
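The steps above, from the address hit through the release of the packet's storage space, may be sketched end to end (a behavioral model in Python; the function name, the plain-list stand-in for the ring buffer 302, the 64-byte header length, and the dict-based memories are illustrative assumptions):

```python
HEADER_LEN = 64  # assumed size (in bytes) of the partial packet content


def handle_packet(sniffer_list: list, mem102: dict, mem104: dict,
                  packet: bytes) -> bytes:
    """End-to-end model: monitored-address hit -> partial packet content
    mirrored into memory 104 -> NPU pre-processes from memory 104 ->
    after forwarding completes, the freed address is pushed back into
    the sniffer list to act as a future address to be monitored."""
    addr = sniffer_list.pop(0)          # current address to monitor (IDX_R)
    mem104[addr] = packet[:HEADER_LEN]  # synchronization on the address hit
    mem102[addr] = packet               # full packet lands in slow memory
    header = mem104[addr]               # NPU reads only the fast memory
    # ... packet forwarded according to the pre-processing result ...
    del mem102[addr]                    # storage released after forwarding
    sniffer_list.append(addr)           # address re-armed for a new packet
    return header
```

Note how the released address re-enters the tail of the list, which is the ring-buffer reuse described at step S516.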
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202311644837.5 | Dec 2023 | CN | national |