Accelerating data transfer in a virtual computer system with tightly coupled TCP connections

Information

  • Patent Grant
  • Patent Number
    8,893,159
  • Date Filed
    Monday, September 9, 2013
  • Date Issued
    Tuesday, November 18, 2014
Abstract
First and second operating systems of a virtual computer system can communicate using respective first and second network protocol stacks, by employing procedures that are specialized for a situation in which a TCP control block of the first stack and a TCP control block of the second stack correspond to the same logical connection. In this case, various TCP requirements can be bypassed by coupling the TCP control blocks, reducing or eliminating data copies and providing other efficiencies.
Description
BACKGROUND

Virtual computer systems, in which more than one operating system runs on a computer system, have been known for decades. Variations of virtual computer systems include architectures in which plural operating systems run on a single processor, plural operating systems run on plural processors, and plural operating systems run on plural processors that are connected by an input/output (I/O) bus such as a peripheral component interconnect (PCI) bus.


Virtual computer systems include architectures in which one or more of the operating systems runs above a native operating system, and architectures in which plural operating systems run above a virtual machine monitor (VMM) or hypervisor layer. Such a VMM or hypervisor can provide a common platform for those operating systems that run above it, and a VMM or hypervisor layer may emulate hardware to the operating systems running above it.


As with other conventional computer systems, an operating system for a virtual computer system may contain a file system that organizes data stored on a disk or other storage system, and a network protocol stack for communicating, via a network interface device, with other entities over a network. When different operating systems of a virtual computer system wish to communicate with each other, for example to exchange data, they typically do so via networking protocols.


Certain networking protocols, such as Transmission Control Protocol (TCP), provide guaranteed delivery of data and other features that require significant computing resources to run. For example, TCP requires a complex control block, sometimes called a TCP control block or TCB, to be maintained at a network node such as a computer system for each logical connection that is set up to provide TCP services. Such a TCB contains status information that fully describes the logical connection from the standpoint of the node by which it is maintained, and so can also be called a TCP connection. An exemplary TCP control block is discussed and illustrated in chapter 24, pages 795-815 of “TCP/IP Illustrated, Volume 2,” Wright and Stevens (1994), which is incorporated herein by reference. Due to the resources required to run TCP, some network interfaces, whether provided as add-on cards or board-level products such as chipsets, have processors or other hardware that offload processing of TCP from a central processing unit (CPU) of the computer system.
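
For orientation, the sketch below shows a much-simplified, purely illustrative version of the per-connection state such a control block holds. The field names loosely echo the BSD conventions in the reference above (SndUna, SndNxt, RcvNxt and so on), but the structure and names are assumptions made for illustration; a real TCB carries far more state.

```c
#include <stdint.h>

/* Illustrative, much-simplified TCP control block. A real TCB (e.g., the
 * BSD tcpcb described in TCP/IP Illustrated, Vol. 2) carries far more
 * state; only fields referenced later in this description are shown. */
struct tcb {
    /* Connection identity (the "four-tuple"). */
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;

    /* Send-side sequence state. */
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_max;   /* highest sequence number sent so far */
    uint32_t snd_wnd;   /* peer's advertised receive window */

    /* Receive-side sequence state. */
    uint32_t rcv_nxt;   /* next sequence number expected */
    uint32_t rcv_wnd;   /* receive window advertised to the peer */

    int      state;     /* ESTABLISHED, FIN_WAIT_2, CLOSE_WAIT, ... */
};
```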


When different operating systems of a virtual computer system exchange data using TCP, the resources required of the virtual computer system to exchange the data are typically doubled in comparison with a computer system that runs only one of the operating systems communicating over a network. For example, for a first application running on a first operating system of the virtual computer system to send data by TCP to a second application running on a second operating system of the same system, the first operating system establishes a TCP connection with the second operating system, after which the first application may request to send data to the second application. The data is then acquired by the network stack of the first operating system and split into TCP/IP segments, which are prefixed with TCP/IP headers including checksums of both the data and the headers, each step of which can include copying the data by the CPU of the virtual computer system. Each of the packets containing headers and data is then prefixed with a data link layer header and transmitted on the network, only to be received from the network by the same virtual computer system, which essentially reverses the process performed by the first operating system and first protocol stack in order to receive the data at the second operating system and second protocol stack. That is, the second protocol stack analyzes the headers and checksums of each received data packet, reassembles the data from the packets, and then provides the data to the second application, each step of which can again include copying the data by the CPU of the virtual computer system.


Because the data to be sent from a first application running on a first operating system of a virtual computer system to a second application running on a second operating system of that virtual computer system may be stored in a memory that can be accessed by both operating systems, proposals have been made to transfer data between guest operating systems by memory remapping procedures instead of using network protocols such as TCP. While such memory remapping could eliminate much of the double copying described above, the logistics are complex, and perhaps for this reason such approaches are not commonly implemented.


SUMMARY

In one embodiment, first and second operating systems of a virtual computer system can communicate using respective first and second network protocol stacks, by employing procedures that are specialized for a situation in which a TCP control block of the first stack and a TCP control block of the second stack correspond to the same logical connection. In this case, various TCP requirements can be bypassed by coupling the TCP control blocks, reducing or eliminating data copies and providing other efficiencies. This brief summary does not purport to define the invention, which is described in detail below and defined by the claims.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows a computer system having plural operating systems communicating with each other using tightly-coupled TCP and a hypervisor.



FIG. 2 shows a computer system having plural operating systems communicating with each other using tightly-coupled TCP and a network interface card.



FIG. 3 shows a computer system having plural operating systems communicating with each other using tightly-coupled TCP that is transferred between a hypervisor and a network interface card.





DETAILED DESCRIPTION


FIG. 1 shows a computer system 20 having a processor 22 and a memory 24. Although a single processor and memory are shown to facilitate understanding, plural processors and/or plural memories may be employed in place of those individual elements, in this figure as well as in subsequent figures. The computer system 20 is connected to a network interface 26 by an I/O channel 28 such as a PCI bus, and the network interface 26 is connected to a network 30.


The computer system 20 is running a first operating system 33 and a second operating system 44, and may be called a virtual computer system. The first operating system 33 and second operating system 44 both run on processor 22 with their instructions stored in memory 24 in this embodiment, although in other embodiments the operating systems may run on different processors and/or be stored on different memories. The first operating system 33 has a first network stack 35 that includes conventional components such as a first TCP layer and a first IP layer. The second operating system 44 has a second network stack 46 that includes conventional components such as a second TCP layer and a second IP layer.


The first operating system 33 and the second operating system 44 run over a VMM or hypervisor 50 that allows both operating systems to be part of the same computer system 20. Although hypervisor 50 is shown simply as a platform for the operating systems, it may also or instead be a native operating system above which the first operating system 33 and second operating system 44 both run. A device driver 55 allows the hypervisor 50 and operating systems 33 and 44 to interact with the network interface 26. Although the device driver 55 is shown as a layer of instructions below the hypervisor 50, individual device drivers may instead be provided for the first and second operating systems 33 and 44, or the device driver 55 may be incorporated in the hypervisor 50. Similarly, although the network interface 26 is shown as a separate entity in FIG. 1, it may be considered part of computer system 20, and may be connected to processor 22 and memory 24 by an internal computer bus rather than an I/O channel.


A first application 60 or other process is running above the first operating system 33, and a second application 62 or other process is running above the second operating system 44. In order to communicate between the first application 60 and the second application 62, the first operating system 33 may use the first network stack 35 and the second operating system 44 may use the second network stack 46. For example, a logical connection may be established between the first TCP layer, which is part of the first network stack 35, and the second TCP layer, which is part of the second network stack 46. To maintain that connection, the first TCP layer creates a first TCP control block or first TCB 64. Because first TCB 64 fully characterizes the state of the logical connection from the standpoint of the first operating system 33 and first network stack 35, it may also be called a TCP connection. Similarly, the second TCP layer creates a second TCP control block or second TCB 66, which fully characterizes the state of the logical connection from the standpoint of the second operating system 44 and second network stack 46, and may also be called a TCP connection.


The TCBs 64 and 66, like all TCP control blocks, can be identified by their source and destination IP addresses and by their source and destination TCP ports. Unlike most or all other TCP control blocks that may be contained in memory 24, however, TCBs 64 and 66 can be identified by each other. That is, because first TCB 64 and second TCB 66 represent two sides of the same logical connection, TCB 64 and TCB 66 are in many aspects mirror images of each other. For example, Table 1 and Table 2 below show that the identifying source and destination IP addresses and TCP ports (sometimes called a four-tuple) of first TCB 64 and second TCB 66 are mirror images of each other.









TABLE 1

TCB 64

Source IP Address        A
Destination IP Address   B
Source TCP Port          X
Destination TCP Port     Y


TABLE 2

TCB 66

Source IP Address        B
Destination IP Address   A
Source TCP Port          Y
Destination TCP Port     X


While the mirror image four-tuples of TCB 64 and TCB 66 allow the pair of TCBs to be identified as belonging to the same logical connection, other aspects of the reciprocal relationship between the first and second TCP connections can be exploited to violate some of the rules of TCP without sacrificing any TCP attributes, providing greatly accelerated data transfer with greatly reduced work by processor 22. That is, because the TCP state is located and referenced by the same set of instructions, conventional TCP processing can be modified to reference and update both TCP control blocks essentially simultaneously. In doing so, traditional TCP processing changes radically. The transmit payload can be transferred directly to a receive buffer in the peer without first segmenting it into MSS-sized packets. Furthermore, the need for ACKs and window updates is eliminated; instead, fields like SndUna and SndWnd can be updated directly in the sender's TCB based on the state of the receiver. These types of modifications for “tightly coupled” TCP connections may also be employed for other network protocols used by guest operating systems of a virtual computer system.
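
As a minimal sketch of that last point, assuming the simplified, hypothetical TCB fields introduced earlier (the struct and function names here are illustrative, not the patent's implementation), the sender's acknowledgement and window state can be refreshed directly from the receiver's control block instead of from ACK and window-update packets:

```c
#include <stdint.h>

/* Trimmed, illustrative TCB fields (hypothetical names). */
struct tcb {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_wnd;   /* peer's advertised receive window */
    uint32_t rcv_nxt;   /* next sequence number the receiver expects */
    uint32_t rcv_wnd;   /* window the receiver would advertise */
};

/* Hypothetical synchronization step for a tightly coupled pair: rather
 * than waiting for ACK or window-update packets, the sender's TCB is
 * updated directly from the state of the receiver's TCB. */
static void coupled_sync(struct tcb *sender, const struct tcb *receiver)
{
    sender->snd_una = receiver->rcv_nxt;  /* delivered data treated as acknowledged */
    sender->snd_wnd = receiver->rcv_wnd;  /* window taken from receiver state */
}
```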


As described in more detail below, tightly coupled TCP communication may be implemented using a hypervisor or other common entity in a virtualized system, as shown by arrow 77, as well as with an offload device such as a transport offload engine (TOE) network interface card (NIC) or other device, an example of which is shown in FIG. 2. In any event, one area of interest is the recognition that two ends of a logical connection such as a TCP connection exist on the same computer system, so that the reciprocal TCBs can be coupled together and communication between the ends of the logical connection accelerated. Because the TCBs are owned by different operating systems, coupling them together first requires that the TCBs are both offloaded to an entity that can control them both, such as the hypervisor or offload device, and as a result can be tightly coupled. In this sense, offloading the TCBs merely means that they are controlled by a different process than the respective TCP layers that established them. Offloading may also mean that the TCBs have been copied to a different part of memory from that in which they were established, to facilitate their being referenced by the same logical code. There may be other reasons to offload one or both TCBs in addition to affording tight coupling, and so the condition in which one or both TCBs have been offloaded may exist before or after the identification of the TCBs as being reciprocal.


The identification of reciprocal TCBs may be performed in various ways and at different times. For example, the hypervisor 50 or device driver 55 can monitor connection establishment packets (SYNs) or other TCB related packets and provide a notification up to the network stacks 35 and 46 that they should offload the connections associated with the IP addresses and ports contained in the packets. Alternatively, at a time when one of the network stacks 35 or 46 offloads a connection to hypervisor 50, the hypervisor can check to determine whether it also controls the other end of that connection. It is also possible for the hypervisor 50 to check for reciprocal TCBs when a request for data transfer is sent by one of the applications or other processes to the other application or process. In either of the latter two cases, the identification of TCB reciprocity can entail searching a list of offloaded connections to find two TCBs whose IP addresses and TCP ports are mirror images of each other. Such a search may be implemented as a linear search of all offloaded TCBs, or it may involve a hashing mechanism on some or all of the four-tuple to reduce the overhead of the search.
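
As a hedged illustration of such a search (the helper names below are hypothetical and not taken from the patent), the mirror-image comparison of four-tuples, together with a symmetric hash that places both ends of a connection in the same bucket, might look like:

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the four-tuple fields of the illustrative TCB are needed here. */
struct tcb {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Two offloaded TCBs describe the same logical connection when their
 * four-tuples are mirror images of each other (cf. Tables 1 and 2). */
static bool tcb_is_reciprocal(const struct tcb *a, const struct tcb *b)
{
    return a->src_ip   == b->dst_ip   &&
           a->dst_ip   == b->src_ip   &&
           a->src_port == b->dst_port &&
           a->dst_port == b->src_port;
}

/* A symmetric hash (XOR is order-independent) buckets both ends of a
 * connection together, so the search for a reciprocal TCB can be limited
 * to one bucket instead of a linear scan of all offloaded TCBs. */
static unsigned tcb_bucket(const struct tcb *t, unsigned nbuckets)
{
    uint32_t h = (t->src_ip ^ t->dst_ip) ^
                 (uint32_t)(t->src_port ^ t->dst_port);
    return (unsigned)(h % nbuckets);
}
```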


For a dynamic offload NIC, illustrated and discussed more fully below with respect to FIG. 2, any of the above methods for identifying reciprocal TCBs may be employed. For a full offload NIC, in which TCP connections are established and maintained on the NIC rather than dynamically offloaded to the NIC, also discussed more fully below, the possibility of identifying TCB reciprocity during the offloading of a TCP connection to the NIC is not present.


Once a pair of TCBs has been identified as reciprocal, those TCBs may be flagged as tightly coupled and linked together via a pointer, for example. To link the TCBs together by a pointer, first TCB 64 can point to the location in memory of second TCB 66, and vice versa. Should these connections be flagged as tightly coupled and/or linked together in this fashion, corresponding code is run by the hypervisor 50 or other entity that accesses both TCBs to remove this flag and/or break the link when one or both of these connections is “uploaded” from the hypervisor or offload device. In this sense, uploading the TCBs means that their control is returned to the respective TCP layers that established them. Such an uploaded TCB may be flagged as having been previously tightly coupled, with an indication to look for the reciprocal TCB to facilitate tight coupling in the future, should it be offloaded again. In some cases it may be desirable to send a request from the hypervisor 50 to a network stack to offload a TCB that is reciprocal to one already controlled by the hypervisor, especially if the TCB controlled by the hypervisor is flagged as having been previously tightly coupled.
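
One hedged sketch of such flagging and linking is shown below (flag and field names are hypothetical); the decoupling routine corresponds to the upload case just described, leaving behind a hint that the connection was previously tightly coupled:

```c
#include <stddef.h>

#define TCB_FLAG_TIGHTLY_COUPLED  0x1  /* currently coupled to its peer     */
#define TCB_FLAG_WAS_COUPLED      0x2  /* previously coupled; look for peer */

struct tcb {
    unsigned    flags;
    struct tcb *peer;   /* points to the reciprocal TCB when coupled */
    /* ...remaining connection state omitted... */
};

/* Couple two TCBs that have been identified as reciprocal. */
static void tcb_couple(struct tcb *a, struct tcb *b)
{
    a->peer = b;
    b->peer = a;
    a->flags |= TCB_FLAG_TIGHTLY_COUPLED;
    b->flags |= TCB_FLAG_TIGHTLY_COUPLED;
}

/* Break the coupling when either connection is uploaded back to the TCP
 * layer that established it, leaving a hint for possible re-coupling. */
static void tcb_decouple(struct tcb *t)
{
    if (t->peer != NULL) {
        t->peer->flags &= ~TCB_FLAG_TIGHTLY_COUPLED;
        t->peer->flags |= TCB_FLAG_WAS_COUPLED;
        t->peer->peer = NULL;
    }
    t->flags &= ~TCB_FLAG_TIGHTLY_COUPLED;
    t->flags |= TCB_FLAG_WAS_COUPLED;
    t->peer = NULL;
}
```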


Synergistic advantages of tightly coupled TCP can be realized when one side or the other of a tightly coupled logical connection performs an I/O operation such as data transfer from one side to the other. This is shown symbolically in FIG. 1 as transfer 80 of first data 70, which is under control of first application 60, to second data 72, which is under control of second application 62. As described above, a first step in this operation is the recognition that both ends of the connection are found in the same system, whether by packet monitoring, four-tuple lookup, or via flags or other identification mechanisms. Tightly coupled I/O involves essentially simultaneously sending from one side while receiving on the other. This can occur, for example, when a single function call is applied to both sides. For example, a function call might be written to perform portions of both the FreeBSD tcp_input and tcp_output operations on both TCBs simultaneously. It is still useful, however, to discuss the procedures from the viewpoint of both the sending and receiving sides of the connection, even though sending and receiving may not happen in a traditional sense.


The receiving side of tightly coupled TCP may involve different modes of operation. For instance, a receiving application may have “posted” one or more buffers to the operating system in which it would like received data to be placed. That buffer can then be propagated to the underlying hypervisor or offload device that will manage the tightly coupled connection. We call this the “posted buffer” mode. Conversely, if no receive buffer has been posted, received data might be “indicated” from the underlying hypervisor or offload device, up to the guest operating system, and in turn to the application. We call this the “indication” mode.


In the “posted buffer” mode, the hypervisor or offload device may have access to the memory containing the data to be sent and the buffer in which to place it. Data transfer then involves a copy or DMA from the first location to the second, with the transfer length being the minimum of the lengths of those two buffers. A DMA engine may be part of a chipset for processor 22, so that DMA transfer can occur without data crossing the I/O channel 28. Upon completion of that copy or DMA, the state within both TCBs is updated to reflect this data transfer. For instance, on the sender, TCB fields such as SndNxt, SndMax and SndUna are all advanced by the length of data transferred. Note that by advancing SndUna we consider the data to be instantly acknowledged, although no acknowledgement packets are sent or received. Similarly, the TCB of the receiver is modified to advance RcvNxt by the same value. In this “posted buffer” mode, the state of each TCB is adjusted to reflect the fact that the window does not close as a result of this data transfer. This is due to the fact that the posted receive buffer is from the receiving application, and as such, the data is considered to be consumed by the application.
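
A hypothetical sketch of this posted-buffer path follows (the function, parameter, and field names are assumptions for illustration): the transfer length is the smaller of the send length and the posted buffer length, the memcpy stands in for a CPU copy or DMA, and the receive window is deliberately left unchanged because the posted buffer belongs to the receiving application.

```c
#include <stdint.h>
#include <string.h>

struct tcb {                        /* trimmed, illustrative TCB fields */
    uint32_t snd_una, snd_nxt, snd_max;
    uint32_t rcv_nxt, rcv_wnd;
};

static uint32_t umin(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* "Posted buffer" mode: move data straight from the sender's source buffer
 * into the buffer posted by the receiving application, then update both
 * TCBs in one pass. Returns the number of bytes transferred. */
static uint32_t coupled_send_posted(struct tcb *snd, struct tcb *rcv,
                                    const void *src, uint32_t src_len,
                                    void *posted, uint32_t posted_len)
{
    uint32_t len = umin(src_len, posted_len);

    memcpy(posted, src, len);       /* could equally be a DMA transfer */

    snd->snd_nxt += len;            /* sender's sequence state advances...  */
    snd->snd_max += len;
    snd->snd_una += len;            /* ...and is treated as instantly ACKed */
    rcv->rcv_nxt += len;            /* receiver now expects the next byte   */
    /* rcv->rcv_wnd is left unchanged: the data landed in the application's
     * own buffer, so it is considered already consumed. */
    return len;
}
```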


In the “indication” mode, the data is placed into network buffers controlled by the receiver. These buffers are typically allocated by a network device driver and then propagated up to the protocol stack or application, at which point the data contained in them is either “consumed” by copying the data (with the CPU) to another buffer, or “refused” (not consumed). In the event that data is not consumed, a “posted buffer” may be passed down instead.


There are several things to note about data transfer in “indication” mode. First, as in the “posted buffer” mode, the state variables, including SndUna and RcvNxt, are advanced by the amount of data transferred, and reflect the same “instant acknowledgment” described above. One difference, however, is that in this case the data is not considered to be consumed by the application, and as such the state variables are adjusted to reflect a closing receive window on the receiver. That window is reopened via a subsequent notification to the hypervisor or offload device that data has been copied out of the network buffers into an application buffer. A second thing to note about “indication” mode is that it may be desirable to withhold some of the send data with the expectation that a “posted buffer” may be presented in response to the indicated data. Note further that in indication mode the possibility exists that the receiver may be out of network receive buffers at the time that the sender wishes to send data. In this case, the hypervisor or offload engine may simply choose to delay the send until such time as network buffers become available. The underlying hypervisor or offload engine is then responsible for keeping track of which send operations are pending, so that when network buffers are subsequently provided by the receiver, the pending send operations are restarted.
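
A corresponding sketch of the indication path is given below (again with hypothetical names): the receive window closes by the amount transferred, a later notification reopens it once the data has been copied to an application buffer, and a send that arrives while no network buffers are available is queued so it can be restarted when buffers are provided.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct tcb {                        /* trimmed, illustrative TCB fields */
    uint32_t snd_una, snd_nxt, snd_max;
    uint32_t rcv_nxt, rcv_wnd;
};

struct pending_send {               /* a send deferred for lack of buffers */
    const void          *src;
    uint32_t             len;
    struct pending_send *next;
};

/* "Indication" mode: place data in a network buffer owned by the receiver
 * and close the receive window, since the application has not yet consumed
 * the data. If no network buffer is available, queue the send instead.
 * Returns the number of bytes transferred. */
static uint32_t coupled_send_indicate(struct tcb *snd, struct tcb *rcv,
                                      const void *src, uint32_t len,
                                      void *netbuf, uint32_t netbuf_len,
                                      struct pending_send **pending)
{
    if (netbuf == NULL || netbuf_len == 0) {
        struct pending_send *p = malloc(sizeof *p);
        if (p != NULL) {
            p->src  = src;
            p->len  = len;
            p->next = *pending;     /* remember the send for a later restart */
            *pending = p;
        }
        return 0;
    }

    len = len < netbuf_len ? len : netbuf_len;
    len = len < rcv->rcv_wnd ? len : rcv->rcv_wnd;  /* respect the window */
    memcpy(netbuf, src, len);

    snd->snd_nxt += len;
    snd->snd_max += len;
    snd->snd_una += len;            /* instant acknowledgement, as before */
    rcv->rcv_nxt += len;
    rcv->rcv_wnd -= len;            /* window closes until data is consumed */
    return len;
}

/* When the indicated data is later copied out to an application buffer,
 * a notification reopens the window by the amount consumed. */
static void coupled_window_reopen(struct tcb *rcv, uint32_t consumed)
{
    rcv->rcv_wnd += consumed;
}
```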


The “posted buffer” mode has an advantage over the “indication” mode in that data is moved directly to the application buffer, avoiding a copy from an intermediate network buffer. As such, if there is a likelihood that a buffer will be posted, it is preferable to wait for that to occur.


It is worth noting that no packets of any kind need be sent or received for either TCP connection once the reciprocal TCBs are tightly coupled, in contrast to the double copying of each packet that may occur for communication between guest operating systems of a virtual computer system. A DMA operation, whether performed by a DMA engine on network interface 26 or computer system 20, can eliminate even the reduced copying between buffers that may otherwise be performed by a CPU such as processor 22. Other aspects of TCP are also altered or eliminated through the use of “tightly-coupled” TCP. Retransmission timers (and retransmits), window probes and keepalives are eliminated. Round trip timers are set to a fixed minimum and held there (no calculations performed). Slow-start, congestion-control, and error recovery mechanisms (new reno, sack) are bypassed entirely.


It should be noted, however, that while we are altering TCP behavior as it applies to communication within these two tightly coupled TCBs, the state of each TCB is maintained such that conventional TCP processing may resume at any time, which would be required should one or both TCBs be uploaded by its respective guest operating system.


Tightly coupled TCP operations may also include a TCP state change. For instance, one side of the connection might elect to close (disconnect) the connection. As in the data transfer discussion above, this can involve simultaneously adjusting the state variables to reflect the transfer of a single byte (a FIN takes one sequence number). It also involves a state change on both halves of the connection. The state of the side that sent the FIN is changed to FIN_WAIT2, while the state of the receiver is immediately changed to CLOSE_WAIT. Note that the state of the sender skips over FIN_WAIT1 since the FIN is considered to be ACKed immediately.


This operation also requires an indication of the FIN, or an analogous disconnection notification, on the receiving side up to the protocol stack of the guest operating system. This indication may ultimately result in a close request in response, or possibly an upload of the connection. In the case of the close request, this would result in a similar state change operation, except that this time the sender of the FIN would move immediately from CLOSE_WAIT to CLOSED, skipping the LAST_ACK state due to the “immediate acknowledgement”, while the receiver of the FIN would move from FIN_WAIT2 to TIME_WAIT.
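
A hedged sketch of such a coupled close is shown below (state constants and names are illustrative only): the FIN consumes one sequence number and is treated as immediately acknowledged, so the closing side skips FIN_WAIT_1, and the responding close later skips LAST_ACK in the same way.

```c
#include <stdint.h>

enum tcp_state { ESTABLISHED, FIN_WAIT_2, CLOSE_WAIT, TIME_WAIT, CLOSED };

struct tcb {                        /* trimmed, illustrative TCB fields */
    uint32_t snd_una, snd_nxt, snd_max;
    uint32_t rcv_nxt;
    enum tcp_state state;
};

/* First half-close of a tightly coupled pair: the closing side's "virtual"
 * FIN is treated as instantly acknowledged, so it skips FIN_WAIT_1, while
 * the peer is notified and moves directly to CLOSE_WAIT. */
static void coupled_close(struct tcb *closer, struct tcb *peer)
{
    closer->snd_nxt += 1;           /* a FIN consumes one sequence number   */
    closer->snd_max += 1;
    closer->snd_una += 1;           /* immediate acknowledgement of the FIN */
    peer->rcv_nxt   += 1;

    closer->state = FIN_WAIT_2;
    peer->state   = CLOSE_WAIT;
}

/* Responding half-close: the responder skips LAST_ACK and moves straight
 * to CLOSED, while the original closer moves from FIN_WAIT_2 to TIME_WAIT. */
static void coupled_close_response(struct tcb *responder, struct tcb *closer)
{
    responder->snd_nxt += 1;
    responder->snd_max += 1;
    responder->snd_una += 1;
    closer->rcv_nxt    += 1;

    responder->state = CLOSED;
    closer->state    = TIME_WAIT;
}
```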



FIG. 2 shows a computer system 100 having a first computer 102 and a second computer 104 that are connected to a network interface 126 by an I/O channel 128 such as a PCI bus, with the network interface 126 connected to a network 130. The first computer 102 has a first processor 112 and a first memory 114. The first computer 102 is running a first operating system 133 with a first network stack 135 that includes conventional components such as a first TCP layer and a first IP layer. A first device driver 138 is running below the first network stack 135 and a first application or other process 160 is running above the first operating system 133.


The second computer 104 has a second processor 122 and a second memory 124. The second computer 104 is running a second operating system 144 with a second network stack 146 that includes conventional components such as a second TCP layer and a second IP layer. A second device driver 155 is running below the second network stack 146 and a second application or other process 162 is running above the second operating system 144.


The NIC 126 includes a NIC processor 114 and a NIC memory 116, as well as a DMA engine 180 that can access computer memories 114 and 124. The NIC 126 includes a network protocol stack 114 or hardware that can perform network protocol functions, including at least a subset of TCP functions. The NIC 126 may be a dynamic offload NIC such as that pioneered by Alacritech, Inc., which can manage a TCP connection that has been established by first network stack 135 or second network stack 146, as described for example in U.S. Pat. No. 6,434,620, which is incorporated by reference. The NIC 126 may alternatively be a full offload NIC that establishes and maintains TCP connections but is not designed to transfer TCP connections from or to a computer or other device.


The NIC memory 116 contains a first TCB 164 that was established by the first network stack 135 and then acquired by the NIC 126. The NIC memory 116 may also contain many other TCBs, including a second TCB 166 that was established by the second network stack 146 and then acquired by the NIC 126. The first and second TCBs 164 and 166 both correspond to a logical connection between the first application 160 on the first computer 102 and the second application 162 on the second computer 104. TCB 164 and TCB 166 have been identified by the NIC 126 as being reciprocal, either by checking the four-tuples of packets or TCBs during connection establishment, TCB offload, or data transfer, optionally using a hashing mechanism and/or NIC hardware to accelerate the search for reciprocal TCBs.


Once TCB 164 and TCB 166 have been identified by the NIC 126 as being reciprocal, they are coupled together as shown by arrow 177. The coupling 177 may include flagging the TCBs 164 and 166 as tightly coupled and/or linking them together via a pointer, for example. That is, first TCB 164 can point to the location in memory 116 of second TCB 166, and vice versa. Should these TCP connections be flagged as tightly coupled and/or linked together in this fashion, corresponding code is run by the NIC 126 to remove this flag and/or break the link when one or both of these connections is “uploaded” from the NIC 126 to respective computers 102 and/or 104. Such an uploaded TCB may be flagged as having been previously tightly coupled, with an indication to look for the reciprocal TCB to facilitate tight coupling in the future, should it be offloaded again.


The tightly coupled TCBs 164 and 166 can both be referenced within a single function call, providing substantially simultaneous updating of the TCBs. As mentioned above with regard to the hypervisor embodiment, many advantages of tightly coupled TCP can be realized when one of the sides of a tightly coupled logical connection performs an I/O operation such as data transfer from one side to the other. This is shown symbolically in FIG. 2 as transfer 177 of first data 170, which is under control of first application 160, to second data 172, which is under control of second application 162. Although labeled as first data 170 and second data 172 to facilitate understanding, first data may simply be a copy of second data, or vice versa. Also, the transfer may actually take place via NIC 126, as described below, but from the viewpoint of computers 102 and 104 the effect may be that shown by arrow 177, because all of the data transfer processing can be offloaded to NIC 126 using tightly coupled TCP. The transfer of first data 170 to second data 172, after first TCB 164 and second TCB 166 have been tightly coupled, begins with a request or command to transfer data being communicated from one of the applications 160 or 162 to its respective operating system 133 or 144. The operating system 133 or 144 recognizes that the corresponding TCB 164 or 166 has been offloaded to NIC 126, and so sends a command to the NIC 126 to send the data.


The NIC 126 in turn recognizes that the corresponding TCB 164 or 166 has been flagged as part of a tightly coupled pair of TCBs, and that data transfer can therefore be accelerated. Instead of transferring the data by way of conventional TCP/IP packets, the data may be transferred by tightly coupled TCP using the posted buffer or indication modes described above. In either of these examples, the DMA engine may transfer multi-kilobyte (e.g., 64 KB) blocks of data between memory 114 and memory 124 without processor 112 or processor 122 performing any data copying. The DMA engine 180 performs such data transfers under control of processor 114, which executes specialized instructions for data transfer using tightly coupled TCP. In concert with the specialized instructions for data transfer, the processor executes instructions specialized for tightly coupled TCP to update TCBs 164 and 166. The specialized instructions for updating TCBs 164 and 166 may violate many rules of the TCP protocol, yet provide reliable, error-free, ordered delivery of data without congestion, overflow or underflow. Portions of the TCP protocol that are modified or eliminated for data transfer with tightly coupled TCP include segmentation and reassembly, reordering of packets, window control, congestion control, creating and analyzing checksums, and acknowledgement processing. In short, tightly coupled TCP may accelerate data delivery by orders of magnitude while reducing total processing overhead by similar amounts, all without sacrificing any of the other attributes of TCP, such as guaranteed data delivery. Moreover, as noted above, conventional TCP processing can be resumed at any time.



FIG. 3 shows a computer system 200 having a processor 202 and a memory 204, the computer system connected to a network interface 226 by an I/O channel 228 (e.g., PCI, PCI Express, InfiniBand, etc.) with the network interface connected to a network 230. The computer system 200 is running a first operating system 233 and a second operating system 244, the first operating system 233 having a first network stack 235 that includes conventional components such as a first TCP layer and a first IP layer, the second operating system having a second network stack 246 that includes conventional components such as a second TCP layer and a second IP layer. A first application or other process 260 is running above the first operating system 233 and a second application or other process 262 is running above the second operating system 244. A hypervisor 250 is running below the first network stack 235 and second network stack 246, and a device driver 255 is running below or integrated into the hypervisor 250.


NIC 226 includes a NIC processor 222 and NIC memory 224, as well as a DMA engine 280 that can access computer memory 204, for example to transfer a TCB to or from NIC memory 224. The NIC 226 includes a network protocol stack 214 or hardware that can perform network protocol functions, including at least a subset of the TCP protocol, such as handling bulk data transfer for a TCP connection.


The first application 260 and second application 262 wish to communicate with each other and, because they are running above different operating systems of virtual computer system 200, utilize their respective network stacks to facilitate the communication. First network stack 235 establishes a first TCB 264 to define and manage that communication for the first application 260, and second network stack 246 establishes a second TCB 266 to define and manage that communication for the second application 262. Although the TCP protocol and TCP control blocks are discussed in this embodiment, other networking protocols that utilize control blocks to define and manage communications for applications may alternatively be employed. After establishment by the respective network stacks, TCB 264 and TCB 266 are offloaded to hypervisor 250, device driver 255 or another common platform or process for first and second operating systems 233 and 244. TCB 264 and TCB 266 may be identified by hypervisor 250, device driver 255 or another process or entity as being reciprocal, either by checking the four-tuples of packets or TCBs, during connection establishment, TCB offload, or during data transfer, optionally using a hashing mechanism and/or NIC 226 or other hardware to accelerate the search for reciprocal TCBs.


Once the reciprocal relationship of TCBs 264 and 266 has been identified, they can be tightly coupled, as illustrated by arrow 277, by hypervisor 250 or another entity having instructions that can reference TCB 264 and TCB 266 as a related pair of TCBs rather than with conventional instructions for individual TCBs that incorporate no knowledge of the reciprocal relationship between the TCBs. As described above, reciprocal TCBs 264 and 266 may be flagged as tightly coupled and linked together via a pointer, so that first TCB 264 can point to the location in memory of second TCB 266, and vice versa.


With TCBs 264 and 266 flagged as tightly coupled and/or linked together in this fashion, code may be run by the hypervisor 250 or other entity that accesses both TCBs to remove this flag and/or break the link when one or both of these connections is uploaded from the hypervisor or other entity. In this sense, uploading the TCBs means that their control is returned to the respective TCP layers that established them. Such an uploaded TCB may be flagged as having been previously tightly coupled, with an indication to look for the reciprocal TCB to facilitate tight coupling in the future, should it be offloaded again. In some cases it may be desirable to send a request from the hypervisor 250 or other entity that can perform tight coupling to a network stack to offload a TCB that is reciprocal to one already controlled by the hypervisor, especially if the TCB controlled by the hypervisor is flagged as having been previously tightly coupled.


Similar code can be provided should one or both of these TCBs be offloaded again, for example from hypervisor 250 to dynamic offload NIC 226. In this case, it may be desirable to offload the TCBs 264 and 266 together from the hypervisor to the NIC, so that the TCBs remain tightly coupled even during the offloading. Reasons for offloading a tightly coupled pair of connections from hypervisor 250 to NIC 226 include utilizing the hardware of NIC 226 rather than processor 202 to transfer data. A set of instructions that accomplishes this tightly coupled offloading may be provided to the hypervisor and the NIC, and a similar but converse set of instructions can be provided to those entities to accomplish a tightly coupled uploading. Because both ends of the logical connection are controlled by one entity, the possibility of in-transit packets that may cause a race condition in offloading a single TCB can be avoided, and a multistep offload need not occur.



FIG. 3 shows a situation in which tightly coupled TCBs 264 and 266 have been offloaded from hypervisor 250 to NIC 226 as a tightly coupled pair, and are maintained in NIC memory 224 as first TCB 264′ and second TCB 266′, which are tightly coupled as shown by arrow 288. Alternatively, first and second TCBs 264′ and 266′ may have been separately acquired by the NIC 226. First and second TCBs 264′ and 266′ may similarly be uploaded to hypervisor 250 as a tightly coupled pair or separately as individual TCBs. Uploading of the TCBs as a tightly-coupled pair may be accomplished in a single step, whereas separate uploading of the TCBs may require a three-step process for each of the individual TCBs, to avoid the race condition mentioned above. In the former case the TCBs are flagged as being tightly-coupled and are acted on by instructions specialized for their tightly-coupled status; in the latter case each TCB is flagged as having been previously tightly-coupled and is operated on by instructions for handling individual TCBs.


Although we have focused on detailed descriptions of particular embodiments, other embodiments and modifications are within the spirit of this invention as defined by the appended claims. For example, although TCP is discussed as an exemplary transport level protocol, other control blocks that define logical connections, which may be found in other protocols or in modifications to the TCP protocol, may instead be employed. Moreover, although a virtual computer system is discussed, other systems in which a single entity can access both ends of a logical connection may also be tightly-coupled.

Claims
  • 1. A method comprising: running a first operating system and a second operating system on a computer system, the first operating system having a first network protocol stack including a first transmission control protocol (TCP) layer and the second operating system having a second network protocol stack including a second TCP layer;establishing a logical connection between the first and second TCP layers, including creating, by the first TCP layer, a first TCP control block corresponding to the logical connection, and creating, by the second TCP layer, a second TCP control block corresponding to the logical connection; andcoupling the first TCP control block to the second TCP control block, including executing instructions, by an entity that accesses the first and second TCP control blocks, that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection.
  • 2. The method of claim 1, wherein executing instructions includes referencing the first TCP control block and the second TCP control block within a function call.
  • 3. The method of claim 1, wherein coupling the first TCP control block to the second TCP control block includes associating the first TCP control block with the second TCP control block by a virtual machine manager.
  • 4. The method of claim 1, wherein coupling the first TCP control block to the second TCP control block includes associating the first TCP control block with the second TCP control block by the first operating system.
  • 5. The method of claim 1, wherein coupling the first TCP control block to the second TCP control block includes storing the first TCP control block and the second TCP control block on an interface that is connected to the computer system by an input/output (I/O) channel.
  • 6. The method of claim 1, further comprising recognizing that the first TCP control block and the second TCP control block correspond to the same logical connection.
  • 7. The method of claim 1, further comprising recognizing that the first TCP control block and the second TCP control block are reciprocal to each other.
  • 8. The method of claim 1, further comprising recognizing that the first TCP control block is identified by IP addresses and TCP ports that are a mirror image of the IP addresses and TCP ports of the second TCP control block.
  • 9. The method of claim 1, further comprising flagging the first TCP control block and the second TCP control block as tightly-coupled.
  • 10. The method of claim 1, wherein the first TCP control block points to a location in memory of the second TCP control block, and the second TCP control block points to a location in memory of the first TCP control block.
  • 11. The method of claim 1, further comprising offloading, from the entity to a second entity, the first and second TCP control blocks as a pair of tightly coupled control blocks.
  • 12. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes transferring data without a corresponding TCP header from a memory source controlled by a process running above the first TCP layer directly to a memory destination controlled by a process running above the second TCP layer.
  • 13. The method of claim 12, wherein transferring the data is performed without the data being copied by a central processing unit (CPU) that runs the first or second operating system, and without the data traversing an I/O channel of the computer.
  • 14. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes transferring, by direct memory access (DMA), data from a memory source controlled by a process running above the first TCP layer to a memory destination controlled by a process running above the second TCP layer.
  • 15. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid transferring window update packets between the first and second TCP layers.
  • 16. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid transferring acknowledgements (ACKs) between the first and second TCP layers.
  • 17. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid segmenting data that is transferred between the first and second TCP layers into packets.
  • 18. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid checksumming data that is transferred between the first and second TCP layers.
  • 19. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid retransmitting data that is transferred between the first and second TCP layers.
  • 20. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid timers.
  • 21. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid round trip time calculations.
  • 22. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes altering the TCP protocol to avoid congestion-control.
  • 23. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes substantially simultaneously referencing the first and second TCP control blocks by the entity.
  • 24. The method of claim 1, wherein executing instructions that are specialized for a situation in which the first and second TCP control blocks correspond to the same logical connection includes substantially simultaneously updating the first and second TCP control blocks by the entity.
  • 25. A method comprising: running a first operating system and a second operating system on a computer system, the first operating system having a first network protocol stack including a first transmission control protocol (TCP) layer and the second operating system having a second network protocol stack including a second TCP layer;establishing a logical connection between the first and second TCP layers, including creating, by the first TCP layer, a first TCP control block corresponding to the logical connection, and creating, by the second TCP layer, a second TCP control block corresponding to the logical connection; andcoupling the first TCP control block to the second TCP control block, including executing a function call that references the first and second TCP control blocks.
  • 26. The method of claim 25, wherein executing a function call that references the first and second TCP control blocks includes substantially simultaneously updating the first and second TCP control blocks by the entity.
  • 27. The method of claim 25, wherein coupling the first TCP control block to the second TCP control block includes associating the first TCP control block with the second TCP control block by a virtual machine manager.
  • 28. The method of claim 25, wherein coupling the first TCP control block to the second TCP control block includes associating the first TCP control block with the second TCP control block by the first operating system.
  • 29. The method of claim 25, wherein coupling the first TCP control block to the second TCP control block includes storing the first TCP control block and the second TCP control block on an interface that is connected to the computer system by an input/output (I/O) channel.
  • 30. The method of claim 25, further comprising recognizing that the first TCP control block and the second TCP control block correspond to the same logical connection.
  • 31. The method of claim 25, further comprising recognizing that the first TCP control block and the second TCP control block are reciprocal to each other.
  • 32. The method of claim 25, further comprising recognizing that the first TCP control block is identified by IP addresses and TCP ports that are a mirror image of the IP addresses and TCP ports of the second TCP control block.
  • 33. The method of claim 25, further comprising flagging the first TCP control block and the second TCP control block as tightly-coupled.
  • 34. The method of claim 25, wherein the first TCP control block points to a location in memory of the second TCP control block, and the second TCP control block points to a location in memory of the first TCP control block.
  • 35. The method of claim 25, further comprising offloading, from the entity to a second entity, the first and second TCP control blocks as a pair of tightly coupled control blocks.
  • 36. The method of claim 25, wherein executing a function call that references the first and second TCP control blocks includes transferring data without a corresponding TCP header from a memory source controlled by a process running above the first TCP layer directly to a memory destination controlled by a process running above the second TCP layer.
  • 37. The method of claim 36, wherein transferring the data is performed without the data being copied by a central processing unit (CPU) that runs the first or second operating system, and without the data traversing an I/O channel of the computer.
  • 38. The method of claim 25, wherein executing a function call that references the first and second TCP control blocks includes transferring, by direct memory access (DMA), data from a memory source controlled by a process running above the first TCP layer to a memory destination controlled by a process running above the second TCP layer.
  • 39. The method of claim 25, further comprising altering the TCP protocol to avoid transferring window update packets between the first and second TCP layers.
  • 40. The method of claim 25, further comprising altering the TCP protocol to avoid transferring acknowledgements (ACKs) between the first and second TCP layers.
  • 41. The method of claim 25, further comprising altering the TCP protocol to avoid segmenting data that is transferred between the first and second TCP layers into packets.
  • 42. The method of claim 25, further comprising altering the TCP protocol to avoid checksumming data that is transferred between the first and second TCP layers.
  • 43. The method of claim 25, further comprising altering the TCP protocol to avoid retransmitting data that is transferred between the first and second TCP layers.
  • 44. The method of claim 25, further comprising altering the TCP protocol to avoid timers.
  • 45. The method of claim 25, further comprising altering the TCP protocol to avoid round trip time calculations.
  • 46. The method of claim 25, further comprising altering the TCP protocol to avoid congestion-control.
  • 47. The method of claim 25, wherein executing a function call that references the first and second TCP control blocks includes substantially simultaneously referencing the first and second TCP control blocks by the entity.
  • 48. A method comprising: running a first operating system and a second operating system on a computer system, the first operating system having a first network protocol stack including a first transmission control protocol (TCP) layer and the second operating system having a second network protocol stack including a second TCP layer;establishing a logical connection between the first and second TCP layers, including creating, by the first TCP layer, a first TCP control block corresponding to the logical connection, and creating, by the second TCP layer, a second TCP control block corresponding to the logical connection;coupling the first TCP control block to the second TCP control block, including referencing the first TCP control block by a set of instructions and referencing the second TCP control block by the set of instructions.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 U.S.C. 120 of (is a continuation of) U.S. patent application Ser. No. 12/410,366, filed Mar. 24, 2009, now U.S. Pat. No. 8,539,513, which in turn claims the benefit under 35 U.S.C. 119 of U.S. Provisional Patent Application 61/072,773, filed in April 2008, both of which are incorporated by reference herein.

US Referenced Citations (299)
Number Name Date Kind
4366538 Johnson et al. Dec 1982 A
4485455 Boone et al. Nov 1984 A
4485460 Stambaugh Nov 1984 A
4589063 Shah et al. May 1986 A
4700185 Balph et al. Oct 1987 A
4991133 Davis et al. Feb 1991 A
5056058 Hirata et al. Oct 1991 A
5058110 Beach et al. Oct 1991 A
5097442 Ward et al. Mar 1992 A
5129093 Muramatsu et al. Jul 1992 A
5163131 Rowet et al. Nov 1992 A
5212778 Dally et al. May 1993 A
5274768 Traw et al. Dec 1993 A
5280477 Trapp Jan 1994 A
5281963 Ishikawa et al. Jan 1994 A
5289580 Latif et al. Feb 1994 A
5303344 Yokoyama et al. Apr 1994 A
5412782 Hausman et al. May 1995 A
5418912 Christenson May 1995 A
5448566 Richter et al. Sep 1995 A
5485455 Dobbins et al. Jan 1996 A
5485460 Schrier et al. Jan 1996 A
5485579 Hitz et al. Jan 1996 A
5506966 Ban Apr 1996 A
5511169 Suda Apr 1996 A
5517668 Szwerinski et al. May 1996 A
5524250 Chesson et al. Jun 1996 A
5535375 Eshel et al. Jul 1996 A
5548730 Young et al. Aug 1996 A
5553241 Shirakhar Sep 1996 A
5566170 Bakke et al. Oct 1996 A
5574919 Netravali et al. Nov 1996 A
5588121 Reddin et al. Dec 1996 A
5590328 Seno et al. Dec 1996 A
5592622 Isfeld et al. Jan 1997 A
5596574 Perlman et al. Jan 1997 A
5598410 Stone Jan 1997 A
5619650 Bach et al. Apr 1997 A
5629933 Delp et al. May 1997 A
5633780 Cronin May 1997 A
5634099 Andrews et al. May 1997 A
5634127 Cloud et al. May 1997 A
5642482 Pardillos Jun 1997 A
5664114 Krech, Jr. et al. Sep 1997 A
5671355 Collins Sep 1997 A
5678060 Yokoyama et al. Oct 1997 A
5682534 Kapoor et al. Oct 1997 A
5684954 Kaiserswerth et al. Nov 1997 A
5692130 Shobu et al. Nov 1997 A
5699317 Sartore et al. Dec 1997 A
5699350 Kraslavsky Dec 1997 A
5701434 Nakagawa Dec 1997 A
5701516 Cheng et al. Dec 1997 A
5727142 Chen Mar 1998 A
5742765 Wong et al. Apr 1998 A
5749095 Hagersten May 1998 A
5751715 Chan et al. May 1998 A
5751723 Vanden Heuvel et al. May 1998 A
5752078 Delp et al. May 1998 A
5758084 Silverstein et al. May 1998 A
5758089 Gentry et al. May 1998 A
5758186 Hamilton et al. May 1998 A
5758194 Kuzman May 1998 A
5768618 Erickson et al. Jun 1998 A
5771349 Picazo, Jr. et al. Jun 1998 A
5774660 Brendel et al. Jun 1998 A
5778013 Jedwab Jul 1998 A
5778419 Hansen et al. Jul 1998 A
5790804 Osborne Aug 1998 A
5794061 Hansen et al. Aug 1998 A
5802258 Chen Sep 1998 A
5802580 McAlpine Sep 1998 A
5809328 Nogales et al. Sep 1998 A
5809527 Cooper et al. Sep 1998 A
5812775 Van Seters et al. Sep 1998 A
5815646 Purcell et al. Sep 1998 A
5819111 Davies et al. Oct 1998 A
5828835 Isfeld et al. Oct 1998 A
5848293 Gentry Dec 1998 A
5870394 Oprea Feb 1999 A
5872919 Wakeland Feb 1999 A
5878225 Bilansky et al. Mar 1999 A
5878227 Wade et al. Mar 1999 A
5892903 Klaus Apr 1999 A
5898713 Melzer et al. Apr 1999 A
5913028 Wang et al. Jun 1999 A
5917828 Thompson Jun 1999 A
5920566 Hendel et al. Jul 1999 A
5930830 Mendelson et al. Jul 1999 A
5931918 Row et al. Aug 1999 A
5935205 Murayama et al. Aug 1999 A
5935249 Stern et al. Aug 1999 A
5937169 Connery et al. Aug 1999 A
5941969 Ram et al. Aug 1999 A
5941972 Hoese et al. Aug 1999 A
5950203 Stakuis et al. Sep 1999 A
5963876 Manssen et al. Oct 1999 A
5978844 Tsuchiya et al. Nov 1999 A
5987022 Geiger et al. Nov 1999 A
5991299 Radogna et al. Nov 1999 A
5996013 Delp et al. Nov 1999 A
5996024 Blumenau Nov 1999 A
6005849 Roach et al. Dec 1999 A
6009478 Panner et al. Dec 1999 A
6014380 Hendel et al. Jan 2000 A
6014557 Morton et al. Jan 2000 A
6016513 Lowe Jan 2000 A
6021446 Gentry, Jr. Feb 2000 A
6021507 Chen Feb 2000 A
6026452 Pitts Feb 2000 A
6034963 Minami et al. Mar 2000 A
6038562 Anjur et al. Mar 2000 A
6041058 Flanders et al. Mar 2000 A
6041381 Haese Mar 2000 A
6044438 Olnowich Mar 2000 A
6047323 Krause Apr 2000 A
6047356 Anderson et al. Apr 2000 A
6049528 Hendel et al. Apr 2000 A
6057863 Olarig May 2000 A
6061368 Hitzelberger May 2000 A
6065096 Day et al. May 2000 A
6067569 Khaki et al. May 2000 A
6070200 Gates et al. May 2000 A
6078564 Lakshman et al. Jun 2000 A
6078733 Osborne Jun 2000 A
6097734 Gotesman et al. Aug 2000 A
6101555 Goshey et al. Aug 2000 A
6111673 Chang et al. Aug 2000 A
6115615 Ota et al. Sep 2000 A
6122670 Bennett et al. Sep 2000 A
6141701 Whitney Oct 2000 A
6141705 Anand et al. Oct 2000 A
6145017 Ghaffari Nov 2000 A
6157944 Pederson Dec 2000 A
6157955 Narad et al. Dec 2000 A
6172980 Flanders et al. Jan 2001 B1
6173333 Jolitz et al. Jan 2001 B1
6181705 Branstad et al. Jan 2001 B1
6202105 Gates et al. Mar 2001 B1
6219693 Napolitano et al. Apr 2001 B1
6223242 Sheafor et al. Apr 2001 B1
6226680 Boucher et al. May 2001 B1
6233242 Mayer et al. May 2001 B1
6243667 Kerr et al. Jun 2001 B1
6246683 Connery et al. Jun 2001 B1
6247060 Boucher et al. Jun 2001 B1
6279051 Gates et al. Aug 2001 B1
6289023 Dowling et al. Sep 2001 B1
6298403 Suri et al. Oct 2001 B1
6324649 Eyres et al. Nov 2001 B1
6334153 Boucher et al. Dec 2001 B2
6343345 Hilla et al. Jan 2002 B1
6343360 Feinleib Jan 2002 B1
6345301 Burns et al. Feb 2002 B1
6345302 Bennett et al. Feb 2002 B1
6356951 Gentry, Jr. Mar 2002 B1
6370599 Anand et al. Apr 2002 B1
6385647 Willis et al. May 2002 B1
6389468 Muller et al. May 2002 B1
6389479 Boucher May 2002 B1
6393487 Boucher et al. May 2002 B2
6418169 Datari Jul 2002 B1
6421742 Tillier Jul 2002 B1
6421753 Haese et al. Jul 2002 B1
6427169 Elzur Jul 2002 B1
6427171 Craft et al. Jul 2002 B1
6427173 Boucher et al. Jul 2002 B1
6434620 Boucher et al. Aug 2002 B1
6434651 Gentry, Jr. Aug 2002 B1
6449656 Elzur et al. Sep 2002 B1
6452915 Jorgensen Sep 2002 B1
6453360 Muller et al. Sep 2002 B1
6453406 Sarnikowski et al. Sep 2002 B1
6470415 Starr et al. Oct 2002 B1
6473425 Bellaton et al. Oct 2002 B1
6480489 Muller et al. Nov 2002 B1
6483804 Muller et al. Nov 2002 B1
6487202 Klausmeier et al. Nov 2002 B1
6487654 Dowling Nov 2002 B2
6490631 Teich et al. Dec 2002 B1
6502144 Accarie Dec 2002 B1
6523119 Pavlin et al. Feb 2003 B2
6526446 Yang et al. Feb 2003 B1
6542504 Mahler et al. Apr 2003 B1
6570884 Connery et al. May 2003 B1
6591302 Boucher et al. Jul 2003 B2
6591310 Johnson Jul 2003 B1
6594261 Boura et al. Jul 2003 B1
6631484 Born Oct 2003 B1
6648611 Morse et al. Nov 2003 B2
6650640 Muller et al. Nov 2003 B1
6657757 Chang et al. Dec 2003 B1
6658480 Boucher et al. Dec 2003 B2
6678283 Teplitsky Jan 2004 B1
6681364 Calvignac et al. Jan 2004 B1
6683851 Willkie et al. Jan 2004 B1
6687758 Craft et al. Feb 2004 B2
6697366 Kim Feb 2004 B1
6697868 Craft et al. Feb 2004 B2
6751665 Philbrick et al. Jun 2004 B2
6757731 Barnes et al. Jun 2004 B1
6757746 Boucher et al. Jun 2004 B2
6765901 Johnson et al. Jul 2004 B1
6807581 Starr et al. Oct 2004 B1
6842896 Redding et al. Jan 2005 B1
6862264 Moura et al. Mar 2005 B1
6912522 Edgar Jun 2005 B2
6938092 Burns Aug 2005 B2
6941386 Craft et al. Sep 2005 B2
6965941 Boucher et al. Nov 2005 B2
6976148 Arimilli et al. Dec 2005 B2
6996070 Starr et al. Feb 2006 B2
7016361 Swank et al. Mar 2006 B2
7024460 Koopmas et al. Apr 2006 B2
7042898 Blightman et al. May 2006 B2
7047320 Arimilli et al. May 2006 B2
7073196 Dowd et al. Jul 2006 B1
7076568 Philbrick et al. Jul 2006 B2
7089326 Boucher et al. Aug 2006 B2
7093099 Bodas et al. Aug 2006 B2
7124205 Craft et al. Oct 2006 B2
7133940 Blightman et al. Nov 2006 B2
7167926 Boucher et al. Jan 2007 B1
7167927 Philbrick et al. Jan 2007 B2
7174393 Boucher et al. Feb 2007 B2
7181531 Pinkerton et al. Feb 2007 B2
7185266 Blightman et al. Feb 2007 B2
7187679 Dally et al. Mar 2007 B2
7191241 Boucher et al. Mar 2007 B2
7191318 Tripathy et al. Mar 2007 B2
7237036 Boucher et al. Jun 2007 B2
7254696 Mittal et al. Aug 2007 B2
7260518 Kerr et al. Aug 2007 B2
7284070 Boucher et al. Oct 2007 B2
7287092 Sharp Oct 2007 B2
7337241 Boucher et al. Feb 2008 B2
7461160 Boucher et al. Dec 2008 B2
7472156 Philbrick et al. Dec 2008 B2
7496689 Sharp et al. Feb 2009 B2
7502869 Boucher et al. Mar 2009 B2
7519699 Jain et al. Apr 2009 B2
7543087 Philbrick et al. Jun 2009 B2
7584260 Craft et al. Sep 2009 B2
7620726 Craft et al. Nov 2009 B2
7627001 Craft et al. Dec 2009 B2
7627684 Boucher et al. Dec 2009 B2
7640364 Craft et al. Dec 2009 B2
7664868 Boucher et al. Feb 2010 B2
7664883 Craft et al. Feb 2010 B2
7673072 Boucher et al. Mar 2010 B2
7694024 Philbrick et al. Apr 2010 B2
7738500 Jones et al. Jun 2010 B1
8028071 Mahalingam et al. Sep 2011 B1
20010004354 Jolitz Jun 2001 A1
20010013059 Dawson et al. Aug 2001 A1
20010014892 Gaither et al. Aug 2001 A1
20010014954 Purcell et al. Aug 2001 A1
20010025315 Jolitz Sep 2001 A1
20010037406 Philbrick et al. Nov 2001 A1
20010048681 Bilic et al. Dec 2001 A1
20010053148 Bilic et al. Dec 2001 A1
20020073223 Darnell et al. Jun 2002 A1
20020112175 Makofka et al. Aug 2002 A1
20020156927 Boucher et al. Oct 2002 A1
20030014544 Pettey Jan 2003 A1
20030046330 Hayes Mar 2003 A1
20030066011 Oren Apr 2003 A1
20030067903 Jorgensen Apr 2003 A1
20030110271 Jayam et al. Jun 2003 A1
20030110344 Szczepanek et al. Jun 2003 A1
20030165160 Minami et al. Sep 2003 A1
20040010712 Hui et al. Jan 2004 A1
20040042458 Elzur Mar 2004 A1
20040042464 Elzur et al. Mar 2004 A1
20040049580 Boyd et al. Mar 2004 A1
20040049601 Boyd et al. Mar 2004 A1
20040054814 McDaniel Mar 2004 A1
20040059926 Angelo et al. Mar 2004 A1
20040073703 Boucher et al. Apr 2004 A1
20040088262 Boucher et al. May 2004 A1
20040153578 Elzur Aug 2004 A1
20040210795 Anderson Oct 2004 A1
20040213290 Johnson et al. Oct 2004 A1
20040246974 Gyugyi et al. Dec 2004 A1
20040249957 Ekis et al. Dec 2004 A1
20040267866 Carollo et al. Dec 2004 A1
20050060538 Beverly Mar 2005 A1
20050144300 Craft et al. Jun 2005 A1
20060133386 McCormack et al. Jun 2006 A1
20060248208 Walbeck et al. Nov 2006 A1
20060274762 Pong Dec 2006 A1
20070083682 Bartley et al. Apr 2007 A1
20070140240 Dally et al. Jun 2007 A1
20080043732 Desai et al. Feb 2008 A1
20080170501 Patel et al. Jul 2008 A1
20080184273 Sekar Jul 2008 A1
20080209084 Wang et al. Aug 2008 A1
20080240111 Gadelrab Oct 2008 A1
20090063696 Wang et al. Mar 2009 A1
Foreign Referenced Citations (13)
Number Date Country
WO 9819412 May 1998 WO
WO 9850852 Nov 1998 WO
WO 9904343 Jan 1999 WO
WO 9965219 Dec 1999 WO
WO 0013091 Mar 2000 WO
WO 0104770 Jan 2001 WO
WO 0105107 Jan 2001 WO
WO 0105116 Jan 2001 WO
WO 0105123 Jan 2001 WO
WO 0140960 Jun 2001 WO
WO 0159966 Aug 2001 WO
WO 0186430 Nov 2001 WO
WO 2007130476 Nov 2007 WO
Non-Patent Literature Citations (87)
Entry
“Hardware Assisted Protocol Processing” (which Eugene Feinberg is working on). Downloaded from the Internet and printed on Nov. 25, 1998. 1 page.
“Z85C30 CMOS SCC Serial Communication Controller”, Zilog Inc. Zilog product Brief. 1997. 3 pages.
“Smart LAN Work Requests.” Internet pages of Xpoint Technologies, Inc. Printed Dec. 19, 1997. 5 pages.
“Asante and 100BASE-T Fast Ethernet.” Internet pages printed May 27, 1997. 7 pages.
“A Guide to the Paragon XP/S-A7 Supercomputer at Indiana University.” Internet pages printed Dec. 21, 1998. 13 pages.
Stevens, W. Richard. TCP/IP Illustrated, Vol. 1, The Protocols. 1994. pp. 325-326.
“Northbridge/Southbridge vs. Intel Hub Architecture.” Internet pages printed Feb. 19, 2001. 4 pages.
“Gigabit Ethernet Technical Brief, Achieving End-to-End Performance.” Alteon Networks, Inc., First Edition. Sep. 1996. 15 pages.
“Technical Brief on Alteon Ethernet Gigabit NIC Technology” Internet pages downloaded from www.alteon.com. Printed Mar. 15, 1997. 14 pages.
“VT8501 Apollo MVP4.” VIA Technologies, Inc. Revision 1.3. Feb. 1, 2000. Pages i-iv, 1-11, cover and copyright page.
“iReady Rounding Out Management Team with Two Key Executives.” iReady News Archives article downloaded from http://www.ireadyco.com/archives/keyexec.html. Printed Nov. 28, 1998. 2 pages.
Internet pages from iReady Products web site, http://www.ireadyco.com/products.html. Printed Nov. 25, 1998. 2 pages.
iReady News Archives, Toshiba, iReady shipping Internet chip. Printed Nov. 25, 1998. 1 page.
“Technology.” Interprophet article downloaded from http://www.interprophet.com/technology.html. Printed Mar. 1, 2000. 17 pages.
“The I-1000 Internet Tuner.” iReady Corporation article. Date unknown. 2 pages.
“About Us Introduction.” iReady article downloaded from the internet http://www.iReadyco.com/about.html. Printed Nov. 25, 1998. 3 pages.
“Revolutionary Approach to Consumer Electronics Internet Connectivity Funded.” iReady News Archive article. San Jose, California. Nov. 20, 1997 (printed Nov. 2, 1998). 2 pages.
“Seiko Instruments Inc. (SII) Introduces World's First Internet-Ready Intelligent LCD Modules Based on iReady Technology.” iReady News Archive article. Santa Clara, CA and Chiba, Japan. Oct. 26, 1998 (printed Nov. 2, 1998). 2 pages.
“iReady Internet Tuner to Web Enable Devices.” NEWSwatch article. Tuesday, Nov. 5, 1996 (printed Nov. 2, 1998). 2 pages.
Lammers, David. “Tuner for Toshiba, Toshiba Taps iReady for Internet Tuner.” EE Times article. Printed Nov. 2, 1998. 2 pages.
Carbone, J.S. “Comparison of Novell Netware and TCP/IP Protocol Architectures.” Printed Apr. 10, 1998. 19 pages.
“AEA-7110C-a DuraSAN product.” Adaptec article. Printed Oct. 1, 2001. 11 pages.
“iSCSI and 2 Gigabit Fibre Channel Host Bus Adapters from Emulex, QLogic, Adaptec, JNI.” iSCSI HBA article. Printed Oct. 1, 2001. 8 pages.
“FCE-3210/6410 32 and 64-bit PCI-to-Fibre Channel HBA.” iSCSI HBA article. Printed Oct. 1, 2001. 6 pages.
“iSCSI Storage.” iSCSI.com article. Printed Oct. 1, 2001. 2 pages.
Kalampoukas et al. “Two-Way TCP Traffic Over Rate Controlled Channels: Effects and Analysis.” IEEE Transactions on Networking. Vol. 6, No. 6. Dec. 1998. 17 pages.
“Toshiba Delivers First Chips to Make Consumer Devices Internet-Ready Based on iReady Design.” iReady News article. Santa Clara, CA, and Tokyo, Japan. Oct. 14, 1998 (printed Nov. 2, 1998). 3 pages.
Jolitz, Lynne. “Frequently Asked Questions.” Internet pages of InterProphet. Printed Jun. 14, 1999. 4 pages.
Hitz et al. “File System Design for an NFS File Server Appliance.” Winter of 1992. 13 pages.
“Adaptec Announces EtherStorage Technology.” Adaptec Press Release article. May 4, 2000 (printed Jun. 15, 2000). 2 pages.
“EtherStorage Frequently Asked Questions.” Adaptec article. Printed Jul. 19, 2000. 5 pages.
“EtherStorage White Paper.” Adaptec article. Printed Jul. 19, 2000. 7 pages.
Berlino, J. et al. “Computers; Storage.” CIBC World Markets article. Aug. 7, 2000. 9 pages.
Milunovich, S. “Storage Futures.” Merrill Lynch article. May 10, 2000. 22 pages.
Taylor, S. “Montreal Start-Up Battles Data Storage Bottleneck.” CBS Market Watch article. Mar. 5, 2000 (printed Mar. 7, 2000). 2 pages.
Satran, J. et al. “SCSI/TCP (SCSI over TCP).” Internet-draft article. Feb. 2000 (printed May 19, 2000). 38 pages.
“Technical White Paper-Xpoint's Disk to LAN Acceleration Solution for Windows NT Server.” Internet pages. Printed Jun. 5, 1997. 15 pages.
“Network Accelerator Chip Architecture, twelve-slide presentation.” Jato Technologies article. Printed Aug. 19, 1998. 13 pages.
“Enterprise System Uses Flexible Spec.” EETimes article. Aug. 10, 1998 (printed Nov. 25, 1998). 3 pages.
“Smart Ethernet Network Interface Cards,” which Berend Ozceri is developing. Internet pages. Printed Nov. 25, 1998. 2 pages.
“GigaPower Protocol Processor Product Review.” Internet pages of Xaqti corporation. Printed Nov. 25, 1999. 4 pages.
Oren, Amit (inventor). Assignee: Siliquent Technologies Ltd. “CRC Calculations for Out of Order PDUs.” U.S. Appl. No. 60/283,896, filed Apr. 12, 2003.
Walsh, Robert J. “Dart: Fast Application Level Networking via Data-Copy Avoidance.” Internet pages. Printed Jun. 3, 1999. 25 pages.
Tanenbaum, Andrew S. Computer Networks, Third Edition, ISBN 0-13-349945-6. Mar. 6, 1996.
Druschel, Peter et al. “LRP: A New Network Subsystem Architecture for Server Systems.” Article from Rice University. Oct. 1996. 14 pages.
“TCP Control Block Interdependence.” Internet RFC/STD/FYI/BCP Archives article with heading “RFC2140” web address http://www.faqs.org/rfcs/rfc2140.html. Printed Sep. 2, 2002. 9 pages.
“Tornado: for Intelligent Network Acceleration.” Wind River article. Copyright Wind River Systems. 2001. 2 pages.
“Complete TCP/IP Offload for High-Speed Ethernet Networks.” Wind River White Paper. Copyright Wind River Systems. 2002. 7 pages.
“Solving Server Bottlenecks with Intel Server Adapters.” Intel article. Copyright Intel Corporation. 1999. 8 pages.
Schwaderer et al. “XTP in VLSI Protocol Decomposition for ASIC Implementation.” IEEE Computer Society Press publication from 15th Conference on Local Computer Networks. Sep. 30-Oct. 3, 1990. 5 pages.
Beach, Bob. “UltraNet: An Architecture for Gigabit Networking.” IEEE Computer Society Press publication from 15th Conference on Local Computer Networks. Sep. 30-Oct. 3, 1990. 18 pages.
Chesson, et al. “The Protocol Engine Chipset.” IEEE Symposium Record from Hot Chips III. Aug. 26-27, 1991. 16 pages.
MacLean et al. “An Outboard Processor for High Performance Implementation of Transport Layer Protocols.” IEEE Global Telecommunications Conference, Globecom '91, presentation. Dec. 2-5, 1991. 7 pages.
Ross et al. “FX1000: A high performance single chip Gigabit Ethernet NIC.” IEEE article from Compcon '97 Proceedings. Feb. 23-26, 1997. 7 pages.
Strayer et al. “Ch. 9: the Protocol Engine.” From XTP: The Transfer Protocol. Jul. 1992. 12 pages.
Publication entitled “Protocol Engine Handbook.” Oct. 1990. 44 pages.
Koufopavlou et al. “Parallel TCP for High Performance Communication Subsystems.” IEEE Global Telecommunications Conference, Globecom '92, presentation. Dec. 6-9, 1992. 7 pages.
Lilienkamp et al. “Proposed Host-Front End Protocol.” Dec. 1984. 56 pages.
Thia et al. “High-Speed OSI Protocol Bypass Algorithm with Window Flow Control.” Protocols for High Speed Networks. 1993. pp. 53-68.
Jolitz, William et al. “TCP/IP Network Accelerator and Method of Use.” Filed Jul. 17, 1997. U.S. Appl. No. 60/053,240.
Thia et al. “A Reduced Operational Protocol Engine (ROPE) for a multiple-layer bypass architecture.” Protocols for High Speed Networks. 1995. pp. 224-239.
Form 10-K for Exelan, Inc., for the fiscal year ending Dec. 31, 1987. 10 pages.
Form 10-K for Exelan, Inc., for the fiscal year ending Dec. 31, 1988. 10 pages.
Merritt, Rick. “Ethernet Interconnect Outpacing Infiniband at Intel.” EE Times article. Sep. 11, 2002. 9 pages.
Starr, Daryl D. et al. “Intelligent Network Storage Interface Device.” U.S. Appl. No. 09/675,700, filed Sep. 29, 2000.
Boucher, Laurence B. et al. “Intelligent Network Interface System and Method for Accelerated Protocol Processing.” U.S. Appl. No. 09/692,561, filed Oct. 18, 2000.
Craft, Peter K. et al. “Transferring Control of TCP Connections Between Hierarchy of Processing Mechanisms.” U.S. Appl. No. 11/249,006, filed Oct. 11, 2005.
Starr, Daryl D. et al. “Accelerating Data Transfer in a Virtual Computer System with Tightly Coupled TCP Connections.” U.S. Appl. No. 12/410,366, filed Mar. 24, 2009.
Craft, Peter K. et al. “TCP Offload Send Optimization.” U.S. Appl. No. 12/504,021, filed Jul. 16, 2009.
Craft, Peter K. et al. “TCP Offload Device that Batches Session Layer Headers to Reduce Interrupts as Well as CPU Copies.” U.S. Appl. No. 12/581,342, filed Oct. 19, 2009.
Philbrick, Clive M. et al. “Freeing Transmit Memory on a Network Interface Device Prior to Receiving an Acknowledgment That Transmit Data Has Been Received by a Remote Device.” U.S. Appl. No. 12/470,980, filed May 22, 2009.
Boucher, Laurence B. et al. “Obtaining a Destination Address so That a Network Interface Device Can Write Network Data Without Headers Directly Into Host Memory.” U.S. Appl. No. 12/325,941, filed Dec. 1, 2008.
Boucher, Laurence B. et al. “Enabling an Enhanced Function of an Electronic Device.” U.S. Appl. No. 11/985,948, filed Nov. 19, 2007.
Starr, Daryl D. et al. “Network Interface Device With 10 Gb/s Full-Duplex Transfer Rate.” U.S. Appl. No. 11/799,720, filed May 1, 2007.
Craft, Peter K. et al. “Peripheral Device That DMAS the Same Data to Different Locations in a Computer.” U.S. Appl. No. 11/788,719, filed Apr. 19, 2007.
Boucher, Laurence B. et al. “TCP/IP Offload Network Interface Device.” U.S. Appl. No. 11/701,705, filed Feb. 2, 2007.
Starr, Daryl D. et al. “TCP/IP Offload Device With Reduced Sequential Processing.” U.S. Appl. No. 11/348,810, filed Feb. 6, 2006.
Boucher, Laurence B. et al. “Network Interface Device That Can Transfer Control of a TCP Connection to a Host CPU.” U.S. Appl. No. 11/029,863, filed Jan. 4, 2005.
Craft, Peter K. et al. “Protocol Stack That Offloads a TCP Connection From a Host Computer to a Network Interface Device.” U.S. Appl. No. 11/027,842, filed Dec. 30, 2004.
Craft, Peter K. et al. “Protocol Stack That Offloads a TCP Connection From a Host Computer to a Network Interface Device.” U.S. Appl. No. 11/016,642, filed Dec. 16, 2004.
Boucher, Laurence B. et al. “Method and Apparatus for Dynamic Packet Batching With a High Performance Network Interface.” U.S. Appl. No. 10/678,336, filed Oct. 3, 2003.
Philbrick Clive M. et al. “Method and Apparatus for Data Re-Assembly With a High Performance Network Interface.” U.S. Appl. No. 10/634,062, filed Aug. 4, 2003.
Boucher, Laurence B. et al. “High Network Interface Device and System for Accelerated Communication.” U.S. Appl. No. 10/601,237, filed Jun. 19, 2003.
Boucher, Laurence B. et al. “Method and Apparatus for Distributing Network Traffic Processing on a Multiprocessor Computer.” U.S. Appl. No. 10/438,719, filed May 14, 2003.
Boucher, Laurence B. et al. “Parsing a Packet Header.” U.S. Appl. No. 10/277,604, filed Oct. 18, 2002.
Starr, Daryl D. et al. “Intelligent Network Storage Interface System.” U.S. Appl. No. 10/261,051, filed Sep. 30, 2002.
Chandranmenon, Girish P. et al. “Trading Packet Headers for Packet Processing.” IEEE/ACM Transactions on Networking. vol. 4, No. 2. Apr. 1996. pp. 141-152.
Provisional Applications (1)
Number Date Country
61072773 Apr 2008 US
Continuations (1)
Number Date Country
Parent 12410366 Mar 2009 US
Child 14022101 US