The present invention relates generally to compression with large block send, and more particularly to offloading Internet Protocol Payload Compression with Large Send to a network level.
The Internet remains a growing public network. Many companies rely on communication over the Internet using Internet Protocol (IP) to facilitate their business endeavors. However, public access also brings security risks. To enhance security on the Internet, the Internet Engineering Task Force (IETF) proposed Internet Protocol Security (IPSec). IPSec is designed to provide authentication and encryption for communication over insecure networks, such as the Internet. However, once a packet is encrypted, it cannot be usefully compressed. Modems with built-in compression, such as V.42bis for example, cannot compress an encrypted packet (because encryption randomizes the data), and thus throughput of such modems is reduced. Accordingly, the IETF proposed Internet Protocol Payload Compression (IPComp) to move compression up in the protocol stack, so that it can happen prior to encryption, instead of at the link level, below IP, as in modems.
IPComp allows systems to negotiate a type of compression for exchanging information prior to encryption. Unfortunately, implementations of IPComp require IPSec-capable computers, because IPComp negotiation is performed using the same negotiation protocol as IPSec, namely, Internet Key Exchange (IKE). Even though IPComp relies on IKE, there is no reason that IPComp cannot be used independently of IPSec, without encrypting and/or authenticating communications. Unfortunately, the Microsoft Windows Operating System provides no Application Program Interface (API) for offloading data for IPComp independently of IPSec. Hereinafter, the term API is used to indicate an entire set of programs, definitions, protocols, subroutines, etc. in an interface, as well as any particular program, definition, protocol, subroutine, etc. within an interface.
There is an API for offloading IPSec to an intelligent network interface (sometimes referred to as a “network interface card” or “NIC”). An intelligent NIC is used to perform computationally intensive network stack operations instead of the host's central processing unit (CPU), which frees up the CPU for other activities. For offloading IPSec, IPSec out-of-band data is created at an application level and passed down to a NIC for encryption on a packet-by-packet basis: the IP packet, including IPSec headers in their correct locations in the packet, is given to the intelligent NIC, along with an index (pointer) into the local “Security Association Database”, which contains connection-specific data, such as encryption keys and which encryption algorithm is in use for the connection.
At most, the amount of data handed down is equivalent to the largest physical packet size (measured in bytes) that a network interface can transmit, also known as the Maximum Transmission Unit (MTU). For example, the MTU for an Ethernet interface is 1,500 bytes, and the payload is the MTU less overhead such as IP and TCP headers (typically 20 bytes each for IPv4 and TCP, or 40 bytes and 20 bytes for IPv6 and TCP, respectively), as well as any options in use. If Ethernet is used, packets of data may be handed down in blocks of about 1,500 bytes each. An additional 14-byte Ethernet header and 4-byte Ethernet trailer are added to such a packet, and thus the maximum size of an Ethernet frame is 1,518 bytes.
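By way of example and not limitation, the following sketch, in Python, models the frame-size arithmetic described above, assuming IPv4 and TCP headers without options; the constant and function names are illustrative only.

```python
# Illustrative sketch of the Ethernet frame-size arithmetic described above,
# assuming IPv4 and TCP headers without options.
ETHERNET_MTU = 1500   # largest IP packet an Ethernet interface can transmit
ETH_HEADER = 14       # Ethernet header added to each packet
ETH_TRAILER = 4       # Ethernet trailer (frame check sequence)
IPV4_HEADER = 20      # IPv4 header without options
TCP_HEADER = 20       # TCP header without options

def max_tcp_payload(mtu=ETHERNET_MTU, ip_hdr=IPV4_HEADER, tcp_hdr=TCP_HEADER):
    """Largest TCP payload that fits in a single packet on this link."""
    return mtu - ip_hdr - tcp_hdr

print(max_tcp_payload())                        # 1460 bytes of payload per packet
print(ETHERNET_MTU + ETH_HEADER + ETH_TRAILER)  # 1518-byte maximum Ethernet frame
```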
In IPSec, an Authentication Header (AH) and/or an Encapsulating Security Payload (ESP) header are optionally inserted in a packet, along with an ESP trailer, which contains the Integrity Check Value (ICV), if ESP-based authentication has been negotiated for the connection. Additionally, if IPComp is in use, it inserts a Compression Header (CH) between the IPSec AH and/or ESP headers and the remainder of the packet. The addition of one or more of these headers adds bytes to a packet. Continuing the above Ethernet example, if the payload handed down from an application level to a network interface level is 1,460 bytes, such a packet payload may have to be broken up, or fragmented, for transmission once the extra headers needed by IPSec, or IPSec and IPComp, are added. However, the hope is that with IPComp the packet payload will be reduced sufficiently to accommodate the additional headers and all of the original payload. Fragmentation should be avoided if possible, because performance suffers, since fragmented packets will not have maximum payload usage.
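By way of illustration only, the following sketch shows the fragmentation concern described above: a payload that already fills the link MTU no longer fits once IPSec and IPComp headers are inserted. The header sizes used here are merely representative, as actual sizes depend on the negotiated algorithms.

```python
# Illustrative check for whether a packet must be fragmented once extra
# headers are inserted; header sizes are representative, not normative.
MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20
AH_HEADER = 24        # e.g., AH with a 96-bit ICV (size varies by algorithm)
IPCOMP_HEADER = 4     # next header, flags, and 16-bit CPI

def needs_fragmentation(payload_len, extra_headers=()):
    total = IPV4_HEADER + sum(extra_headers) + TCP_HEADER + payload_len
    return total > MTU

print(needs_fragmentation(1460))                              # False: fits exactly
print(needs_fragmentation(1460, (AH_HEADER, IPCOMP_HEADER)))  # True: must fragment
# Unless, as noted above, compression shrinks the payload enough to make room.
```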
An approach to address fragmentation is use of an API for “Large Send Offload” (also known as TCP Segmentation Offload) for the Transmission Control Protocol (TCP). The Large Send API supports three component features (which can be used independently or together), namely, TCP segmentation, TCP checksum computation, and IP checksum computation. For purposes of clarity, “Large Send API” is used herein to refer to one or more APIs for initiating a Large Send. For Large Send offloads, a network driver is configured to inform a WinSock stack of an MTU size. So, for example, rather than 1,500 bytes for Ethernet, the network driver would indicate an MTU of 64 kilobytes (KB), or some other large multiple of the actual packet payload capacity. In response to such configuration information, an application would thus send fewer, larger data blocks to the protocol stack, blocks larger than can fit into the link's MTU.
Continuing the above example, data would be sent down to a NIC in blocks of approximately 64 KB. For a NIC with Large Send capability, namely an intelligent NIC, a Network Driver Interface Specification (NDIS) layer provides an exemplary IP and TCP header and a pointer to the large block of data to a NIC driver. This driver divides such data into path-MTU-sized blocks, less any overhead, and sends out successive packets until the intelligent NIC has consumed the entire data block. Continuing the above example, if no options are used, overhead comprises TCP and IP headers totaling 40 bytes, so a 64 KB block of data would be divided as 64,000/1,460, resulting in 43 full packets and one “remainder” packet. If fragmentation had been necessary, each packet would have ended up as two fragments, for over 80 total packets. Thus, fewer packets are used, because more packets are fully loaded, for example with approximately 1,460 bytes of data in each packet except perhaps the last packet, which includes however many bytes remain after transmitting the rest of the data in the large block.
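By way of example and not limitation, the segmentation arithmetic in the preceding example may be sketched as follows; the function name is illustrative only.

```python
# Illustrative sketch of dividing a Large Send block into per-packet payloads.
MTU = 1500
HEADER_OVERHEAD = 40                         # IPv4 + TCP headers, no options
PER_PACKET_PAYLOAD = MTU - HEADER_OVERHEAD   # 1,460 bytes

def segment(block_len, payload=PER_PACKET_PAYLOAD):
    """Return (number of full packets, bytes in the remainder packet)."""
    return divmod(block_len, payload)

full_packets, remainder = segment(64_000)
print(full_packets, remainder)   # 43 full packets and a 1,220-byte remainder packet
```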
The initial large data block that is passed to the intelligent NIC includes a prototype TCP/IP header that is used to build the header for each packet sent from that data. Each Large Send packet will have a slightly different TCP and IP header provided by the NIC, derived from the prototype header, because, for instance, TCP sequence numbers must be incremented by the per-packet payload size, e.g., by 1,460 for each packet, and the TCP checksum will be different for each packet, since it depends on the contents of the packet data. However, the TCP source and destination ports will be the same in each derived packet. At the IP layer, the IP Identification field must be different for each unique packet that is sent, and the IP header checksum will be different for each packet as well, if only because the Identification field is different in each derived packet. Additionally, the calculation of the TCP checksum (which covers a 96-bit IP pseudo-header, the TCP header, and all TCP packet data) and the calculation of the IP header checksum (which covers the IPv4 header but does not depend on packet data) are conventionally offloaded to the NIC driver. However, as noted above, each of the packets also shares common information, such as the IP source address and IP destination address, among other common information as is known, for example the initial TTL.
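By way of illustration only, the following sketch models how per-packet headers might be derived from the prototype header, advancing the TCP sequence number by the payload consumed and assigning a unique IP Identification to each packet; the field set and names are illustrative, and checksum computation (performed per packet by the NIC) is not shown.

```python
# Illustrative derivation of per-packet header fields from a prototype header.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ProtoHeader:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    ttl: int
    seq: int      # starting TCP sequence number
    ip_id: int    # starting IP Identification

def derive_headers(proto, payload_sizes):
    """Yield one header per packet: the sequence number advances by the bytes
    consumed, the IP Identification is unique per packet, and the addresses,
    ports, and TTL are copied unchanged from the prototype."""
    seq, ip_id = proto.seq, proto.ip_id
    for size in payload_sizes:
        yield replace(proto, seq=seq, ip_id=ip_id)
        seq += size
        ip_id = (ip_id + 1) & 0xFFFF

proto = ProtoHeader("10.0.0.1", "10.0.0.2", 49152, 80, ttl=64, seq=1000, ip_id=7)
for header in derive_headers(proto, [1460, 1460, 1220]):
    print(header.seq, header.ip_id)
```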
However, APIs for Large Send do not provide support for IPComp. In fact, there is no API that allows an application to request that compression be offloaded to a lower-layer entity, such as a NIC or similar component, for Large Send; that is, there is no compression “on/off switch” for an application (independent of IPSec). Accordingly, it would be desirable and useful to provide IPComp in the context of Large Send offload capability, by enhancing the Large Send capability with a simultaneous attempt to negotiate compression which, if successful, would enable the Large Send data blocks to be transmitted using fewer packets.
An aspect of the present invention is a method for communicating application data for a communication session between two computers. A first portion of the application data is sent in an uncompressed form, where the first portion of the application data is provided by dividing a first large block of data into first smaller blocks of data. Protocol data is independently sent for determining whether a compression technology may be used, whereby an agreement for use of the compression technology may be established. A second portion of the application data is sent in a compressed form in response to an agreement to compress. The second portion of the application data is provided by compressing a subsequent large block of data to provide a compressed smaller block of data. The compression may be applied to the whole large block, or to each individual smaller block.
Another aspect of the present invention is a method for compressed Large Send. An intelligent network interface with Large Send and Internet Protocol Payload Compression capabilities is provided. A Large Send Application Program Interface (API) is initiated. Processing of uncompressed payload data by the network interface is initiated in response to the Large Send API, and an Internet Key Exchange API is initiated in response to the Large Send API (though the IKE negotiation is not specifically requested by the Large Send API). An Internet Protocol Payload Compression negotiation is initiated through usage of the Internet Key Exchange protocol. In response to successful conclusion of the Internet Protocol Payload Compression negotiation, a portion of the uncompressed payload data is compressed to provide a compressed data portion. The compressed data portion is sectioned to provide compressed data subsections.
So that the manner in which the above recited features, advantages and objects of the present invention are attained and can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings.
It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the present invention may admit to other equally effective embodiments.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the present invention.
Initiation of such a Large Send API is also used to initiate at step 313 an IPSec packet for IKE negotiation using an IPSec API. Such an IKE packet may be sent interleaved or pipelined with data being sent for the Large Send. Notably, IPSec negotiation need not result in an agreement to do authentication and/or encryption, and data need not be sent encrypted. Rather, IPSec negotiation is used to enable access to IPComp negotiation with a receiving computer.
At step 314, IPComp negotiation takes place. Again, IPComp capability 100 may be part of MCP 199 or part of application software. Notably, if IPComp negotiation can be accessed without first enabling IPSec negotiation, step 313 may be omitted, and a Large Send request to MCP 199 may be used to trigger IPComp capability 100. Step 312 can proceed in an interleaved manner (essentially, in parallel) with step 313 and/or step 314, so these steps should not be regarded as strictly sequential. Notably, IPComp capability 100 may be embodied in hardware and/or firmware of MCP 199. Also, because MCP 199 has IPComp capability, IPComp may be done transparently with respect to an operating system of a sending computer, other than initiation of a Large Send API.
At step 315, a check is made to determine if IPComp negotiation resulted in an agreement to compress using an agreed upon compression algorithm. If IPComp negotiation is unsuccessful, data continues to be sent in an uncompressed form, as indicated at step 316.
However, if IPComp negotiation is successful, namely, a compression algorithm is agreed upon between a sending computer and at least one receiving computer, then a data compression mode will be used. Notably, multicasting may be used with a compressed Large Send as described herein. For multicasting, agreement on compression between a sending computer and the receiving computers is needed, and transmission of data, whether uncompressed or compressed, is to such receiving computers.
Optionally, at step 317, a block of data that is already being processed when IPComp negotiation completes may still be sent in an uncompressed manner. A conventional Compression Parameter Index (CPI) is provided as part of the IPComp Compression Header, or CH. Conveniently, a receiving computer in receipt of a packet without a CPI will simply process received data as uncompressed.
At step 318, a next large block of data is obtained. Because larger blocks of data, such as 64 KB blocks, may be compressed as a whole, compression efficiency is improved over compressing smaller blocks of data, such as 1.5 KB blocks. If block-level compression is not negotiated, packet-by-packet compression may also be used, with slightly worse efficiency. Since better compression ratios conventionally are achieved when applying a compression algorithm to a larger block of data, use of Large Send in combination with block-level compression is advantageous. Thus, continuing the above example, a 64 KB block may be compressed down to, for example, 48 KB to 54 KB prior to dividing it into 1.5 KB blocks for transmission, where approximately each 1.5 KB of reduction is one less packet to be transmitted. Notably, by using compression, fewer packets may be used to send data, which enhances system performance and reduces transmission bandwidth consumption.
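By way of example and not limitation, the efficiency difference between block-level and packet-by-packet compression can be illustrated with the following sketch, which uses DEFLATE (Python's zlib) merely as a stand-in for whatever transform IPComp negotiates; the sample data is arbitrary.

```python
# Illustrative comparison of compressing one large block versus compressing
# each 1,460-byte piece independently.
import zlib

data = (b"example payload with repeated structure " * 2000)[:64_000]

block_compressed = len(zlib.compress(data))
per_packet_compressed = sum(
    len(zlib.compress(data[i:i + 1460])) for i in range(0, len(data), 1460)
)
print(block_compressed, per_packet_compressed)
# The single 64 KB block generally compresses better than the sum of the
# independently compressed pieces, so fewer packets need to be transmitted.
```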
However, it is possible that a block of data is not very compressible, so an optional threshold check at step 319 is used. If compression does not reduce a block of data by at least a certain threshold percentage, then MCP 199 may avoid compression to avoid imposing a decompression performance penalty on a receiving computer. For example, a compression result of approximately 3% or less reduction in size, such as from 64 KB down to only about 62 KB or more, may be insufficient to merit compression. If the compressibility threshold is not met, such a block is not sent in compressed form; rather, conventional Large Send processing takes place on such a block at step 320.
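By way of illustration only, the optional threshold check of step 319 may be sketched as follows, using a 3% minimum reduction as in the example above; the function name, threshold value, and use of DEFLATE are illustrative assumptions.

```python
# Illustrative compressibility-threshold check: compress only if the block
# shrinks by at least the threshold fraction, otherwise send it as-is.
import zlib

def compress_if_worthwhile(block, threshold=0.03):
    compressed = zlib.compress(block)
    reduction = 1.0 - len(compressed) / len(block)
    return compressed if reduction >= threshold else None

# A return value of None indicates the block should receive conventional
# Large Send processing (step 320) rather than compressed Large Send.
```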
However, if the compressibility threshold is met at step 319, then at step 321 MCP 199 inserts an IPComp Compression Header for transmitting IPComp-compressed packets, as well as modifying other packet-related information. When MCP 199 receives an IP packet for transmission, and when IPComp is in effect for that transmission flow, the IP header must be modified. As mentioned above, an IPComp header is inserted; the IPv4 Total Length or IPv6 Payload Length field may no longer be correct and will have to be modified; and an IPv4 header checksum will need to be re-calculated. Length changes affect the final packet: intermediate packets retain their original length, because inserting the Compression Header and then taking fewer bytes of compressed data compensates for the additional bytes consumed by the IPComp header, resulting in a packet that is the same size as one sent by Large Send without IPComp. IPComp is compatible with both IPv4 and IPv6, and the modifications to the two headers are similar. For IPv4, the Total Length field is changed to reflect the new length of the IP packet, including the IP header, the inserted IPComp header, and the length of the compressed payload. In IPv6, the Payload Length field is modified to reflect the length of the compressed data plus the inserted IPComp header. The IPv4 Protocol field or the IPv6 Next Header field, as applicable, is changed from its old value to a value that indicates the presence of the Compression Header; the old value is remembered for use in the inserted IPComp header. When necessary, the IPv4 header checksum is recalculated.
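By way of example and not limitation, the IPv4 header modifications described above may be sketched as follows: the Protocol field is changed to indicate IPComp, the Total Length and header checksum are recomputed, and a 4-byte Compression Header carrying the original protocol value and the negotiated CPI is inserted ahead of the compressed payload. This is an illustrative model only; in the described embodiments such packet assembly is performed by MCP 199.

```python
# Illustrative IPv4 header modification and IPComp Compression Header insertion.
import struct

IPPROTO_IPCOMP = 108   # protocol number indicating an IPComp header follows

def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header with its checksum field zeroed."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def insert_ipcomp(ip_header: bytes, compressed_payload: bytes, cpi: int) -> bytes:
    hdr = bytearray(ip_header)                  # 20-byte IPv4 header, no options
    original_protocol = hdr[9]                  # remembered for the IPComp header
    hdr[9] = IPPROTO_IPCOMP                     # Protocol now indicates IPComp
    total_length = len(hdr) + 4 + len(compressed_payload)
    hdr[2:4] = struct.pack("!H", total_length)  # Total Length covers the new header
    hdr[10:12] = b"\x00\x00"                    # zero the checksum field, then refill
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    ipcomp_header = struct.pack("!BBH", original_protocol, 0, cpi)
    return bytes(hdr) + ipcomp_header + compressed_payload
```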
A receiving computer must be able to remove such an IPComp header. To avoid fragmentation, a known number of bytes is held in reserve. For example, an IPComp header is 4 bytes, so payload compression must reserve 4 bytes per Large Send to ensure such an IPComp header is not fragmented from such a Large Send. A CPI field is used along with an IP Destination Address field to identify the compression algorithm in use. As stated above, it is possible to use IKE to negotiate compression associations, namely, agreement on a compression algorithm, independently of IPSec. Alternatively, negotiation between two or more IP endpoints or network nodes of a CPI value, and thus of an associated compression transform for such endpoints, may ride upon an existing IKE session being used for IPSec Security Association (SA) negotiations between such endpoints. As is known, CPI values of 0-63 are reserved for well-known compression algorithms, of which values 0-4 have been allocated as set forth in Table I. Notably, IPComp does not mandate a default compression algorithm or transform. Thus, if two IPComp-capable nodes do not have at least one common transform, they will not be able to exchange compressed data using IPComp.
Moreover, as MCP 199 is configured to do compression, CPU time is not consumed for compressing data at step 321. Furthermore, Large Send capability may be combined with multicasting capability of MCP 199 for doing a multicast Large Send with compressed data at step 321.
Some embodiments of the present invention are program products that may reside in whole or in part in local memory 102 of MCP 199 and/or system memory 13. By way of example and not limitation, memory may be sufficient to hold at least a portion of communication process 310 in accordance with one or more embodiments of the present invention. Memory may comprise volatile and/or non-volatile memory, including but not limited to magnetically readable memory (e.g., floppy disk, hard disk, and the like), optically readable memory (e.g., CD-ROM, -RW, DVD-ROM, -RAM, and the like), and electrically readable memory (e.g., DRAM, SRAM, EEPROM, registers, latches, and the like). Accordingly, some embodiments of the invention are program products containing machine-readable programs. The program(s) of the program product define(s) functions of the embodiments and can be contained on a variety of signal-bearing media, which include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
It should be appreciated that, within a communication session, data may be converted from being sent in an uncompressed form to being sent in a compressed form, and vice versa, in the middle of such a communication session. Because IPComp negotiation is conducted in parallel with the transmission of data, latency of data transmission is reduced as compared with IPSec, wherein agreement must be completed before data can be sent in encrypted form. Moreover, it should be appreciated that a NIC in a sending computer of communicating computers combines Large Send and compression without need for operating system intervention and without need for host CPU usage. Offloading compression of large blocks to a NIC is a significant performance enhancement (due to further reduction in CPU utilization) in addition to the benefit of sending fewer packets.
While the foregoing is directed to the preferred embodiment of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. All trademarks are the respective property of their owners. Claims listing steps do not imply any order of the steps.