The present invention relates to a network processor design, and more particularly, to a network processor using fake packet generation for improving processing efficiency of learning packets and an associated packet processing method.
With the rapid development of the Internet, there are more and more networking applications. Hence, broadband gateways need to handle many kinds of packet processing via different types of networking technologies. One typical network processor design may employ a network accelerator that can offload packet processing from a central processing unit (CPU) in hardware. For example, the network accelerator records a hardware forwarding table. When a packet received by the network processor matches one forwarding rule included in the hardware forwarding table, the packet is directly forwarded by the network accelerator without intervention of the CPU. However, when the packet received by the network processor does not match any forwarding rule included in the hardware forwarding table, the packet is sent to the CPU for further processing, and the hardware forwarding table recorded in the network accelerator is updated by the CPU. A packet that cannot be directly forwarded by the network accelerator may be regarded as a learning packet from the CPU's perspective. Packet loss may occur once the CPU cache is occupied by large numbers of learning packets waiting to be processed by the CPU. Thus, there is a need for an innovative network processor design with improved processing efficiency of learning packets.
One of the objectives of the claimed invention is to provide a network processor using fake packet generation for improving processing efficiency of learning packets and an associated packet processing method.
According to a first aspect of the present invention, an exemplary network processor is disclosed. The exemplary network processor includes a processor. The processor includes at least one processor core and a cache. The at least one processor core is arranged to load and execute program codes to deal with packet processing. The program codes comprise a network driver, a network stack of an operating system (OS) kernel, and a packet pre-learning module. The packet pre-learning module is arranged to generate a fake packet, and send the fake packet to the network stack of the OS kernel through the network driver. The cache is arranged to cache at least a portion of instructions and data associated with processing of the fake packet that is performed by the network stack of the OS kernel.
According to a second aspect of the present invention, an exemplary packet processing method is disclosed. The exemplary packet processing method includes: executing a network driver; executing a network stack of an operating system (OS) kernel; generating a fake packet, and sending the fake packet to the network stack of the OS kernel through the network driver; and caching at least a portion of instructions and data associated with processing of the fake packet that is performed by the network stack of the OS kernel.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
Certain terms are used throughout the following description and claims, which refer to particular components. As one skilled in the art will appreciate, electronic equipment manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to ...”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
The network interface 106 is a hardware interface arranged to communicate with a first network NW1. The network interface 108 is a hardware interface arranged to communicate with a second network NW2. For example, one of the network interfaces 106 and 108 is a local area network (LAN) interface, and the other of the network interfaces 106 and 108 is a wide area network (WAN) interface.
The processor 102 is a host processor, and includes at least one processor core 110 and a cache 112. The processor 102 may be a single-core processor or a multi-core processor, depending upon actual design considerations. In one exemplary design, a single processor core may be represented by the processor core 110. In another exemplary design, multiple processor cores may be collectively represented by the processor core 110. The processor core 110 is arranged to load and execute program codes (i.e., software modules) to deal with packet processing. For example, the program codes may include a network stack 118 of an operating system (OS) kernel (e.g., a Linux kernel network stack) and a plurality of drivers 116, and the drivers 116 may include a network driver (e.g., a LAN/WAN driver 120) and a packet pre-learning module 122. The packet pre-learning module 122 is used for pre-learning (or pre-loading) of recently used instructions and data in the cache 112. This can reduce the processing time of a learning packet received from the network interface (e.g., LAN interface or WAN interface) 106 after a fake packet PKT_FK is generated by the packet pre-learning module 122.
The frame engine 104 is a network accelerator designed for hardware acceleration of the packet forwarding task, and includes a forwarding table 124. If a packet received by the network interface 106 has header information that matches a forwarding rule recorded in the forwarding table 124, the packet can be directly forwarded by the frame engine 104 according to the forwarding rule, without intervention of the processor 102. If a packet received by the network interface 106 has header information that does not match any forwarding rule recorded in the forwarding table 124, the packet is sent to the network stack 118 of the OS kernel 114 through the LAN/WAN driver 120 for further processing. Since a new forwarding rule can be learned from processing of the packet and then stored into the forwarding table 124 to facilitate forwarding of future packets received by the network interface 106, the packet not directly forwarded by the frame engine 104 may be regarded as a learning packet from the processor's point of view. If the instructions and data needed for processing a packet received by the network interface 106 are available in the low-latency cache 112, the processor 102 does not need to retrieve the instructions and data from an external high-latency memory (which is large enough to store all instructions and data).
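The match-or-punt decision made by the frame engine may be sketched in user-space C as follows. The 5-tuple key, field names, and linear-scan lookup are illustrative assumptions only; the actual hardware lookup structure of the forwarding table 124 is not detailed in this description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple key identifying a flow; field names are
 * assumptions for illustration, not the frame engine's real layout. */
struct fwd_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

struct fwd_rule {
    struct fwd_key key;
    int out_if;              /* egress interface selected by the rule */
};

static bool key_eq(const struct fwd_key *a, const struct fwd_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Returns the matching rule for direct hardware forwarding, or NULL
 * when the packet must be punted to the processor as a learning packet. */
static const struct fwd_rule *
fwd_lookup(const struct fwd_rule *table, size_t n, const struct fwd_key *k)
{
    for (size_t i = 0; i < n; i++)
        if (key_eq(&table[i].key, k))
            return &table[i];
    return NULL;
}
```

On a NULL return, the learning path is taken: the packet goes up through the LAN/WAN driver 120 to the network stack 118, and a new rule learned from that processing would later be installed into the table.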
In this embodiment, the packet pre-learning module 122 is arranged to generate the fake packet PKT_FK that acts as a fake learning packet, and send the fake packet PKT_FK to the network stack 118 of the OS kernel 114 through the LAN/WAN driver 120. The cache 112 of the processor 102 is arranged to cache at least a portion (i.e., part or all) of instructions and data associated with processing of the fake packet PKT_FK that is performed by the network stack 118 of the OS kernel 114.
Suppose that the packet PKT_LN1 is received by the network interface (e.g., LAN interface or WAN interface) 106 before the fake packet PKT_FK is generated and sent to the network stack 118 of the OS kernel 114, and the packet PKT_LN2 is received by the network interface (e.g., LAN interface or WAN interface) 106 after the fake packet PKT_FK is generated and sent to the network stack 118 of the OS kernel 114. In one exemplary implementation, a configuration of the fake packet PKT_FK is based at least partly on the packet PKT_LN1. For example, the network interface 106 is a LAN interface and the network interface 108 is a WAN interface, each of the packets PKT_LN1 and PKT_LN2 is to be sent from a LAN side to a WAN side, and the fake packet PKT_FK is generated to simulate a packet to be sent from the LAN side to the WAN side. Fields in a header of the fake packet PKT_FK may include: destination medium access control (MAC) address=Customer Premise Equipment (CPE) LAN MAC (24:4b:fe:09:6b:c0), destination internet protocol (IP) address=WAN IP (192.168.50.10), source IP address=LAN IP (192.168.1.254), IP protocol type=UDP, and UDP data length=50. Suppose that the packet PKT_LN2 is a learning packet that needs to be processed by the network stack 118 of the OS kernel 114. With a proper configuration of the fake packet PKT_FK which is sent to the network stack 118 before the packet PKT_LN2, a sequence of instructions invoked by the network stack 118 of the OS kernel 114 for processing the packet PKT_LN2 received by the network interface 106 may be identical to a sequence of instructions invoked by the network stack 118 of the OS kernel 114 for processing the fake packet PKT_FK generated by the packet pre-learning module 122.
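Filling in the header fields listed above can be sketched as follows. The struct is a simplified summary for illustration (a real PKT_FK would be a complete Ethernet/IP/UDP frame handed to the LAN/WAN driver, with addresses in network byte order); the field values are the example values given in the description.

```c
#include <stdint.h>
#include <string.h>

/* Simplified summary of the fake packet's header fields; a real frame
 * would carry full Ethernet/IP/UDP headers in network byte order. */
struct fake_pkt_hdr {
    uint8_t  dst_mac[6];    /* CPE LAN MAC */
    uint32_t dst_ip;        /* host-order for illustration */
    uint32_t src_ip;
    uint8_t  ip_proto;      /* 17 = UDP */
    uint16_t udp_data_len;
};

static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Populate PKT_FK's header with the example values from the text. */
static void build_fake_packet(struct fake_pkt_hdr *h)
{
    const uint8_t cpe_lan_mac[6] = { 0x24, 0x4b, 0xfe, 0x09, 0x6b, 0xc0 };
    memcpy(h->dst_mac, cpe_lan_mac, sizeof(cpe_lan_mac));
    h->dst_ip = ipv4(192, 168, 50, 10);   /* WAN IP */
    h->src_ip = ipv4(192, 168, 1, 254);   /* LAN IP */
    h->ip_proto = 17;                     /* UDP */
    h->udp_data_len = 50;
}
```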
In this way, the packet processing efficiency of the packet PKT_LN2 can be improved due to the fact that the instructions required for processing the packet PKT_LN2 can be retrieved from the low-latency cache 112 without accessing the external high-latency memory such as a dynamic random access memory (DRAM). Further details of the packet pre-learning module 122 are described below with reference to the accompanying drawings.
Since the fake packet PKT_FK is generated and sent to the network stack 118 of the OS kernel 114 periodically, the cache 112 can be ensured to hold frequently-used instructions and data of packet processing, including ip_rcv( ), ip_route_input_noref( ), ip_forward( ), dev_queue_xmit( ), the skb (socket buffer) data structure, and associated skb operations such as skb_push( ), skb_pull( ), skb_reserve( ), skb_put( ), skb_clone( ), and skb_linearize( ). Since these instructions and the skb data structure are known to those skilled in the pertinent art, further description is omitted here for brevity.
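The cache-warming effect of the periodic fake packet can be illustrated with a toy model. Here each kernel function on the forwarding path is reduced to an enum id and the cache to a presence bitmap; this is purely a pedagogical sketch (a real cache operates on instruction and data addresses with an eviction policy, not on function ids).

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the forwarding-path functions named in the text. */
enum { IP_RCV, IP_ROUTE_INPUT_NOREF, IP_FORWARD, DEV_QUEUE_XMIT, N_FUNCS };

/* Toy instruction-cache model: remembers which functions were touched. */
struct icache_model {
    bool present[N_FUNCS];
};

/* Walk the forwarding path once; each absent function counts as a cache
 * miss and is loaded. Returns the number of misses for this packet. */
static int process_packet(struct icache_model *c)
{
    static const int path[] = { IP_RCV, IP_ROUTE_INPUT_NOREF,
                                IP_FORWARD, DEV_QUEUE_XMIT };
    int misses = 0;
    for (size_t i = 0; i < sizeof(path) / sizeof(path[0]); i++) {
        if (!c->present[path[i]]) {
            misses++;
            c->present[path[i]] = true;
        }
    }
    return misses;
}
```

In this model, the fake packet PKT_FK absorbs the cold misses, so a learning packet such as PKT_LN2 arriving afterwards traverses the same path entirely from the warm cache.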
It should be noted that these instructions mentioned above are for illustrative purposes only, and are not meant to be limitations of the present invention. For example, instructions involved in processing an IPv4 packet are not necessarily the same as those involved in processing an IPv6 packet. To put it simply, the present invention has no limitations on the instructions used for packet processing, and any network processor that loads and executes the proposed packet pre-learning module 122 for fake packet generation falls within the scope of the present invention.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2023/129654 | Nov 2023 | WO | international |