The present disclosure relates to devices that process packets transmitted according to session oriented protocols.
Acceleration for session oriented protocols, such as the Transmission Control Protocol (TCP), requires an offload engine, essentially a computer on the network interface card, to take over the computing operations that would otherwise be required of the host device. The cost and performance benefits of such an offload have been suspect at best. The offload engine copies packets to buffers at the application layer, which in turn requires every packet to be subjected to read and write operations. Copying the payload of packets to the application buffer consumes central processing unit cycles. Higher central processing unit utilization is undesirable because it eventually results in lower throughput and limits the availability of the central processing unit for application layer operations.
Overview
Techniques are provided for zero copy accelerated processing of packets received at a network device according to a session oriented protocol. Each packet comprises a header field and a payload field. Data in the header field of a packet is evaluated to determine whether a sequence number in the header field is equal to an expected sequence number for a given flow of packets. When the sequence number in the header field is equal to the expected sequence number, header data from the header field is stored in a header ring comprising a plurality of socket buffers and payload data is directed to an application buffer pool according to a pointer in a streaming data ring. When the sequence number in the header field is not equal to the expected sequence number, the header data and the payload data are stored in the header ring.
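By way of illustration only, the two paths described above can be sketched as follows. The names `Packet`, `Flow`, and `receive` are hypothetical, a plain list and a byte array stand in for the header ring and the application buffer pool, and advancing the expected sequence number by the payload length (modulo 2^32, as in TCP) is an assumption of the sketch rather than a statement of the disclosed implementation.

```python
from dataclasses import dataclass, field

SEQ_MODULUS = 2**32  # assumption: 32-bit byte sequence numbers, as in TCP


@dataclass
class Packet:
    seq: int        # sequence number carried in the header field
    header: bytes   # header data
    payload: bytes  # payload data


@dataclass
class Flow:
    expected_seq: int
    # Stand-in for the header ring of socket buffers.
    header_ring: list = field(default_factory=list)
    # Stand-in for the application buffer pool.
    app_buffer: bytearray = field(default_factory=bytearray)


def receive(flow: Flow, pkt: Packet) -> bool:
    """Route one packet; return True when it took the in-sequence path."""
    if pkt.seq == flow.expected_seq:
        # In sequence: only the header goes to the header ring, and the
        # payload is directed to the application buffer.
        flow.header_ring.append((pkt.header, None))
        flow.app_buffer += pkt.payload
        flow.expected_seq = (flow.expected_seq + len(pkt.payload)) % SEQ_MODULUS
        return True
    # Out of sequence: header and payload are stored together in the
    # header ring so that software can process them later.
    flow.header_ring.append((pkt.header, pkt.payload))
    return False
```

In this sketch, an out-of-sequence packet leaves the expected sequence number and the application buffer untouched, so the in-sequence path never has to undo a write.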
Example Embodiments
Referring first to
Reference is now made to
Reference is now made to
The header of each packet is stored in one of the socket buffers 122(1)-122(M). The payload of each packet is appended to the streaming data ring 130. Pointers stored in locations of the streaming data ring 130 point to buffers in the application buffer pool 108.
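The relationship among the header ring, the streaming data ring, and the application buffer pool may be modeled, purely as a sketch, as shown below. The `Ring` class, the `store` function, the ring sizes, and the use of list indices as "pointers" are all hypothetical choices made for illustration. The model only shows the layout; in Python the payload bytes are still copied, whereas the disclosure directs them to the pool without processor read and write operations.

```python
class Ring:
    """Fixed-size circular buffer of slots (hypothetical helper)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0    # index of the next slot to fill
        self.count = 0   # number of occupied slots

    def push(self, item):
        if self.count == len(self.slots):
            raise BufferError("ring full")
        index = self.head
        self.slots[index] = item
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return index

    def pop(self):
        if self.count == 0:
            raise IndexError("ring empty")
        tail = (self.head - self.count) % len(self.slots)
        item = self.slots[tail]
        self.slots[tail] = None
        self.count -= 1
        return item


# Application buffer pool: preallocated buffers owned by the application.
app_buffer_pool = [bytearray(8) for _ in range(4)]

header_ring = Ring(size=4)     # each slot plays the role of a socket buffer
streaming_ring = Ring(size=4)  # each slot holds an index into the pool


def store(header: bytes, payload: bytes, pool_index: int) -> None:
    """Put the header in a socket buffer and the payload in a pooled buffer."""
    target = app_buffer_pool[pool_index]
    if len(payload) > len(target):
        raise ValueError("payload larger than pooled buffer")
    header_ring.push(header)
    streaming_ring.push(pool_index)    # the "pointer" into the pool
    target[:len(payload)] = payload    # written in place in the pooled buffer
```

Keeping only indices in the streaming data ring means the ring itself stays small while the payload bytes live in buffers that the application already owns.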
A processor 140, e.g., microprocessor or microcontroller, performs certain operations described herein by software stored in memory 150. For example, memory 150 stores instructions for packet processing logic 200. The operations of the packet processing logic 200 are described hereinafter in connection with
The memory 150 may include read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible memory storage devices. In general, these memory devices may comprise one or more computer readable storage media encoded with software comprising computer executable instructions and when the software is executed, the processor that executes that software (e.g., processor 140) is operable to perform the operations described herein for the packet processing logic 200.
The network stack configuration shown in
The techniques described herein use a classifier that isolates flows into queues. The queues take streaming buffers, i.e., the streaming data ring, instead of traditional discrete buffers. A counter mechanism is used to check for byte offsets in a flow and to use that offset to land the bytes in the streaming data ring. Software analyzes the content of the header socket buffers. If the flow is well formed, header data is put in a separate queue from the payload data; otherwise, if there is an out-of-sequence condition or other packet error, the header data and payload data are put into a header socket buffer for processing. Using this scheme, software in the normal path (bytes that are not out of sequence) only checks the headers in the header queue to validate the bytes in the streaming data ring queue. The payload data is directed to the application buffer pool without being copied into and out of memory, thus avoiding the need for processor read and write operations.
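A minimal sketch of the byte-offset counter and of header-only validation follows. It assumes 32-bit sequence numbers, a ring whose size is a power of two, and header records reduced to (sequence number, length) pairs; the function names `byte_offset`, `land`, and `validated_bytes` are hypothetical and are not taken from the disclosure.

```python
SEQ_MODULUS = 2**32  # assumption: 32-bit sequence numbers


def byte_offset(seq, base_seq, ring_size):
    """Offset in the streaming ring for the byte numbered `seq`.

    `base_seq` is the sequence number that maps to offset 0. `ring_size`
    is assumed to be a power of two so that wrap-around stays consistent.
    """
    return ((seq - base_seq) % SEQ_MODULUS) % ring_size


def land(ring: bytearray, seq, base_seq, payload: bytes) -> None:
    """Write payload bytes at the offset derived from the sequence number."""
    start = byte_offset(seq, base_seq, len(ring))
    for i, value in enumerate(payload):
        ring[(start + i) % len(ring)] = value


def validated_bytes(header_records, expected_seq):
    """Count contiguous in-order bytes using only the header records.

    Stops at the first record whose sequence number is not the expected
    one, mirroring the normal path in which only headers are examined.
    """
    total = 0
    for seq, length in header_records:
        if seq != expected_seq:
            break
        total += length
        expected_seq = (expected_seq + length) % SEQ_MODULUS
    return total
```

For instance, with header records `[(100, 3), (103, 5), (120, 4)]` and an expected sequence number of 100, the first two records are contiguous and the third is not, so `validated_bytes` reports 8 bytes as valid without reading any payload.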
Hardware control of the operation of the components shown in
When, at 162, it is determined that the sequence number of the packet is not equal to the expected sequence number, then processing of the packet proceeds to operation 172. At 172, instead of directing the payload data directly to the application buffer pool, the payload and the header of the packet are sent to the header ring, placing the header and payload in a socket buffer. At 174, software operations are performed to fix the streaming data ring due to an out of sequence packet. Processing then continues to 168 and 170 after operation 174. The software operations at 174 correspond to operations of the packet processing logic 200 referred to in connection with
Turning now to
When, at 222, it is determined that there is header and payload data in the header ring, then processing proceeds along the path to the right in
The following is an example of operations performed when two packets, Packet 1 and Packet 2, arrive out of order, i.e., Packet 2 arrives before Packet 1. Reference is made to
The foregoing presents a packet processing approach that does not require a substantial number of digital logic gates for implementation and accelerates processing of packets communicated according to a session oriented protocol, such as TCP. An example of another session oriented protocol for which these techniques are useful is the Real-time Transport Protocol (RTP) running over the User Datagram Protocol (UDP). These packet processing techniques address the low-latency, high-throughput problem associated with Data Center Ethernet clusters, and achieve closer to ideal performance in terms of throughput and CPU utilization.
Thus, in summary, a scheme is provided herein to perform zero copy offload of session packets without the need for a protocol engine. These techniques are inexpensive to implement and can be used to accelerate any session oriented protocol that uses sequence numbers that are packet offsets. Examples of uses for these techniques include accelerating TCP processing of packets; accelerating data movement in Internet Small Computer System Interface (iSCSI), an Internet Protocol (IP)-based storage networking standard for linking data storage facilities; direct memory access to user space for virtual machines; on-loading in Microsoft's so-called “Chimney” architecture; and any low-latency/high-message rate applications.
Accordingly, in one form, a method is provided comprising: at a network device, receiving packets of data sent using a session oriented protocol, each packet comprising a header field and a payload field; evaluating data in the header field of a packet to determine whether a sequence number in the header field is equal to an expected sequence number for a given flow of packets; when the sequence number in the header field is equal to the expected sequence number, storing header data from the header field in a header ring comprising a plurality of socket buffers and directing payload data to an application buffer pool according to a pointer in a streaming data ring; and when the sequence number in the header field is not equal to the expected sequence number, storing the header data and the payload data in the header ring.
In another form, an apparatus is provided comprising a first ring buffer comprising a plurality of socket buffers; a second ring buffer; a buffer pool; and a processor. The processor is configured to evaluate data in a header field of packets of data received according to a session oriented protocol, to determine whether a sequence number in the header field is equal to an expected sequence number for a given flow of packets; when the sequence number in the header field is equal to the expected sequence number, store header data from the header field to one of the plurality of socket buffers of the first ring buffer and direct payload data to the buffer pool according to a pointer in the second ring buffer; and when the sequence number in the header field is determined to not be equal to the expected sequence number, store the header data and payload data in one of the plurality of socket buffers of the first ring buffer.
In yet another form, one or more computer readable storage media are provided that are encoded with software comprising computer executable instructions that when executed are operable to: at a network device, receive packets of data sent using a session oriented protocol, each packet comprising a header field and a payload field; evaluate data in the header field of a packet to determine whether a sequence number in the header field is equal to an expected sequence number for a given flow of packets; when the sequence number in the header field is equal to the expected sequence number, store header data from the header field in a header ring comprising a plurality of socket buffers and direct payload data to an application buffer pool according to a pointer in a streaming data ring; and when the sequence number in the header field is not equal to the expected sequence number, store the header data and the payload data in the header ring.
The above description is intended by way of example only.
Number | Date | Country |
---|---|---|
20130007296 A1 | Jan 2013 | US |