System and method for using metadata in the context of a transport offload engine

Information

  • Patent Grant
  • Patent Number
    8,065,439
  • Date Filed
    Friday, December 19, 2003
  • Date Issued
    Tuesday, November 22, 2011
Abstract
A system, method, and related data structure are provided for transmitting data in a network. Included is a data object (i.e. metadata) for communicating between a first network protocol layer and a second network protocol layer. In use, the data object facilitates network communication management utilizing a transport offload engine.
Description
FIELD OF THE INVENTION

The present invention relates to transport offload engines, and more particularly to managing network communications utilizing transport offload engines.


BACKGROUND OF THE INVENTION

Transport offload engine (TOE) technology is gaining popularity in high-speed systems for the purpose of optimizing throughput and lowering processor utilization. TOE components are often incorporated into one of various printed circuit boards, such as a network interface card (NIC), a host bus adapter (HBA), or a motherboard; or in any other desired offloading context.


In recent years, the communication speed in systems has increased faster than processor speed. This has produced an input/output (I/O) bottleneck. The processor, which is designed primarily for computing and not for I/O, cannot typically keep up with the data units flowing through the network. As a result, the data flow is processed at a rate slower than the speed of the network. TOE technology solves this problem by removing the burden (i.e. offloading) from the processor and/or I/O subsystem.


Prior art FIG. 1 illustrates a system 100 including both a host processor 102 and a transport offload engine 104, in accordance with the prior art. In use, the processor 102 generates data lists 106 [i.e. scatter-gather lists (SGLs), etc.] for identifying a location in memory 110 where data resides which is to be communicated via a network 116. As shown, the data lists 106 include an address where the data may be found, as well as an associated length.


In use, the processor 102 transmits the data lists 106 to the transport offload engine 104. Armed with such data lists 106, the transport offload engine 104 retrieves the data from the memory 110 and stores the same in a buffer 112, where the data waits to be communicated via the network 116.


To track the various network connections or sockets over which the data is communicated via the network 116, the transport offload engine 104 further employs control blocks 114, which may each include various information associated with a particular network connection or socket.


Thus, to receive a large amount of data via the network 116, the memory required to store the data lists 106 and control blocks 114, as well as the buffer 112, may become excessively large. Unfortunately, a large memory cannot be implemented in a cost-effective manner on an integrated-circuit transport offload engine 104, since integrating on-board memory on the transport offload engine 104 is costly in terms of silicon die area, for example.


There is thus a need for a more cost-effective technique for transmitting data in a network using data lists (SGLs, etc.).


SUMMARY OF THE INVENTION

A system, method and related data structure are provided for communicating data in a network. Included is a data object (i.e. metadata) for communicating between a first network protocol layer and a second network protocol layer. In use, the data object facilitates network communication management utilizing a transport offload engine.


In one embodiment, the first network protocol layer may include a transport protocol layer. Moreover, the second network protocol layer may include a layer above the transport protocol layer. Optionally, the second network protocol layer may include an application protocol, for example a small computer system interface (SCSI) protocol, an Internet small computer system interface (iSCSI) protocol, etc. Of course, the data object may be used in the context of any desired network protocol layer(s).


In another embodiment, the data object may be communicated between a processor and the transport offload engine. Furthermore, the data object may be stored with a data list [i.e. a scatter-gather list (SGL), a memory-descriptor list (MDL), etc.]. Such data list may include an address in memory where data to be communicated is stored, along with any other desired information to facilitate the transmission of the data in a network.


As a further option, the processor may communicate an instruction message to the transport offload engine (TOE) identifying a location in memory where the data list and the data object are stored. Alternatively, the processor may communicate to the TOE a count of how many WORDs have been added to the data list for processing. Still yet, an indicator (i.e. a bit, etc.) may be used to distinguish between the data list elements and the data object elements.


Generally speaking, the data object may be used to communicate state information associated with the second network protocol layer to the first network protocol layer, where the first network protocol layer resides below the second network protocol layer. Further, the data object may be used to communicate (i.e. feedback, etc.) state information associated with the first network protocol layer to the second network protocol layer.


In one embodiment, the data object may include a byte indicator for indicating a number of bytes until a subsequent protocol data unit (PDU). By this feature, markers may be inserted into a data stream in which data is communicated utilizing the transport offload engine for communicating the number of bytes until the subsequent PDU, and/or for communicating an occurrence of a previous PDU.


In another aspect of the present embodiment, the data object may include a start indicator, where the start indicator is adapted for indicating a start of a PDU. Still yet, the data object may include an end indicator for indicating an end of a PDU.


In still another embodiment, the data object may include a transmission control protocol urgent (TCP URG) indicator. In use, such TCP URG indicator may be adapted for indicating a number of bytes until a TCP URG section is complete.


In still another particular embodiment, the data object may include a cyclic redundancy check (CRC) indicator, or another integrity indicator. In use, the CRC indicator may be adapted for clearing a CRC of a socket to zero. Moreover, the CRC indicator may prompt calculation of a CRC, and transmission of the CRC with data communicated in a network. Still yet, the CRC indicator may prompt transmission of a status message to a processor that includes the CRC, where the CRC is stored by the processor in response to the status message for being used during a retransmission request (thus avoiding the need for recalculation).


Thus, the transport offload engine may utilize the data object to process data associated with an upper network protocol layer. Such processed data may then be inserted into a data stream in which the data is communicated utilizing the transport offload engine. The processed data may further be fed back to the processor for use during retransmission. To this end, processing may be offloaded from a processor to the transport offload engine. Moreover, the data may optionally be transmitted between the processor and the transport offload engine only once to conserve resources.





BRIEF DESCRIPTION OF THE DRAWINGS

Prior art FIG. 1 illustrates a system including both a processor and a transport offload engine, in accordance with the prior art.



FIG. 2 illustrates a network system, in accordance with one embodiment.



FIG. 3 illustrates an exemplary architecture in which one embodiment may be implemented.



FIG. 3A illustrates a data object for facilitating network communication management in the context of a transport offload engine, in accordance with one embodiment.



FIG. 4A illustrates an exemplary method for communicating data in a network, in accordance with one embodiment.



FIG. 4B illustrates a continuation of the method of FIG. 4A.



FIG. 5 illustrates exemplary queues and a control block system for communicating data in a network, in accordance with one embodiment.



FIG. 6 illustrates a more detailed view of the contents of the queues of the system of FIG. 5, in accordance with one embodiment.



FIG. 7 illustrates an exemplary configuration of the contents of the queue of FIG. 6, in accordance with one embodiment.



FIG. 8 illustrates an exemplary data stream including markers, in accordance with one embodiment.





DETAILED DESCRIPTION


FIG. 2 illustrates a network system 200, in accordance with one embodiment. As shown, a network 202 is provided. In the context of the present network system 200, the network 202 may take any form including, but not limited to a local area network (LAN), a wide area network (WAN) such as the Internet, etc.


Coupled to the network 202 are a local host 204 and a remote host 206 which are capable of communicating over the network 202. In the context of the present description, such hosts 204, 206 may include a web server, storage server or device, desktop computer, lap-top computer, hand-held computer, printer or any other type of hardware/software. It should be noted that each of the foregoing components as well as any other unillustrated devices may be interconnected by way of one or more networks.



FIG. 3 illustrates an exemplary architecture 300 in which one embodiment may be implemented. In one embodiment, the architecture 300 may represent one of the hosts 204, 206 of FIG. 2. Of course, however, it should be noted that the architecture 300 may be implemented in any desired context.


For example, the architecture 300 may be implemented in the context of a general computer system, a circuit board system, a game console system dedicated for entertainment purposes, a set-top box, a router, a network system, a storage system, an application-specific system, or any other desired system associated with the network 202.


As shown, the architecture 300 includes a plurality of components coupled via a bus 302. Included is at least one processor 304 for processing data. While the processor 304 may take any form, it may, in one embodiment, take the form of a central processing unit (CPU), a chipset (i.e. a group of integrated circuits designed to work together and sold as a unit for performing related functions, etc.), or any other desired processing device(s) capable of processing data.


Further included is processor system memory 306 (e.g. a tangible computer readable medium, etc.) which resides in communication with the processor 304 for storing the data. Such processor system memory 306 may take the form of on-board or off-board random access memory (RAM), a hard disk drive, a removable storage drive (i.e. a floppy disk drive, a magnetic tape drive, a compact disk drive, etc.), and/or any other type of desired memory capable of storing the data.


In use, programs, or control logic algorithms, may optionally be stored in the processor system memory 306. Such programs, when executed, enable the architecture 300 to perform various functions. Of course, the architecture 300 may simply be hardwired.


Further shown is a transport offload engine 312 in communication with the processor 304 and the network (see, for example, network 202 of FIG. 2). In one embodiment, the transport offload engine 312 may remain in communication with the processor 304 via the bus 302. Of course, however, the transport offload engine 312 may remain in communication with the processor 304 via any mechanism that provides communication therebetween. The transport offload engine 312 may include a transport (i.e. TCP/IP) offload engine (TOE), or any integrated circuit(s) that is capable of managing the data communicated in the network.


During operation, in order to provide a cost-effective technique for communicating data in the network, the transport offload engine 312 employs a data object for communicating between a first network protocol layer and a second network protocol layer. More exemplary information regarding one illustrative embodiment of such data object will now be set forth.



FIG. 3A illustrates a data object 320 for facilitating network communication management in the context of a transport offload engine, in accordance with one embodiment. As shown, the data object 320 is for communicating above a first network protocol layer and below a second network protocol layer.


In one embodiment, the first network protocol layer may include a transport protocol layer. Moreover, the second network protocol layer may include any layer above the transport protocol layer. Optionally, the second network protocol layer may, for example, include a small computer system interface (SCSI) protocol, an Internet small computer system interface (iSCSI) protocol, a remote direct memory access (RDMA) protocol, a direct data placement (DDP) protocol, a markers with protocol data unit (PDU) alignment (MPA) protocol, a network file system (NFS) protocol, etc. It should be noted, however, that the data object may be positioned between any desired network protocol layers including, but certainly not limited to, SCSI, iSCSI, RDMA, DDP, TCP, IP, etc.


In use, the data object serves to facilitate network communication management utilizing the transport offload engine 312. While such facilitation may take any form that improves operation (i.e. requires less, if any, memory size or utilization on the transport offload engine 312, etc.), more information will now be set forth regarding an optional, illustrative method by which the transport offload engine 312 utilizes the data object.



FIGS. 4A and 4B illustrate an exemplary method 400 for communicating data in a network, in accordance with one embodiment. As an option, the method 400 may be carried out in the context of the exemplary architecture 300 of FIG. 3. Of course, however, it should be noted that the method 400 may be implemented in any desired context. Moreover, while various functions may be attributed to exemplary components (i.e. like those set forth hereinabove), it is important to understand that the various functionality may be carried out by any desired entity.


As shown, in operation 402, information associated with data to be communicated in a network (see, for example, network 202 of FIG. 2) is written to a queue, utilizing a processor (see, for example, processor 304 of FIG. 3). Such queue may be stored in any desired memory (see, for example, memory 306 of FIG. 3; memory associated with transport offload engine 312; etc.). Moreover, the aforementioned information may include a data list [i.e. scatter-gather list (SGL), memory descriptor list (MDL), etc.] which may include at least one data object. As an option, the data object may take the form of metadata.


As a further option, a plurality of the queues may be provided, one for each network socket, or connection. Moreover, a control block may be provided to track the transmission of data via the various sockets. More exemplary information regarding such queues and control blocks will be provided during reference to FIG. 5.


In the context of the present description, the aforementioned data list may include at least one address in memory where data to be communicated is stored, a length of the data, and/or any other desired information to facilitate the retrieval, management, etc. of data for the communication thereof in the network. Still yet, the data object may include any information capable of facilitating network communication management utilizing the transport offload engine.


For example, the data object may include a byte indicator for indicating a number of bytes until a subsequent protocol data unit (PDU), if marking is to be supported. By this feature, markers may be inserted into a data stream in which data is communicated utilizing the transport offload engine for communicating the number of bytes until the subsequent PDU and/or for communicating an occurrence of a previous PDU, in the manner that will be set forth hereinafter in greater detail.


Still yet, as an option, the data object may include a cyclic redundancy check (CRC) indicator (or any other integrity indicator), when CRC is desired. In the context of the present description, CRC involves a technique of checking for errors in data that has been communicated in a network. To accomplish this check, the transport offload engine typically applies a 16- or 32-bit polynomial to a block of the data that is to be communicated and appends the resulting calculated CRC to the block. The receiving transport offload engine/host then applies the same polynomial to the data and compares the result with the result appended by the sending transport offload engine. If they match, the data has been received successfully. If not, the sending transport offload engine can be notified to resend the block of data.
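
By way of illustration only, such a check may be sketched in C as follows, using a generic bitwise CRC-32 with the reflected polynomial 0xEDB88320; the polynomial, width, byte ordering, and any table- or hardware-based optimization actually used by a given offload engine or protocol (for example, the CRC32C polynomial used by iSCSI digests) may differ.

    #include <stdint.h>
    #include <stddef.h>

    /* Generic bitwise CRC-32 (reflected polynomial 0xEDB88320), shown only to
     * illustrate the append-and-compare check; a real offload engine would
     * typically use a table-driven or hardware implementation, and iSCSI
     * digests use the CRC32C polynomial instead. */
    uint32_t crc32_compute(const uint8_t *buf, size_t len)
    {
        uint32_t crc = 0xFFFFFFFFu;
        for (size_t i = 0; i < len; i++) {
            crc ^= buf[i];
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
        return ~crc;
    }

    /* Receiver side: recompute the CRC over the received block and compare it
     * with the value appended by the sender; a mismatch prompts the sender to
     * resend the block. */
    int crc32_block_is_valid(const uint8_t *block, size_t len, uint32_t appended_crc)
    {
        return crc32_compute(block, len) == appended_crc;
    }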


As mentioned earlier, other forms of data integrity checks (also known as digests) may be used. Further, checks may cover both the header portions as well as data portions separately (i.e. one check for one portion of a PDU, another check for a different portion of a PDU, etc.).


Thus, in use, the CRC indicator may be adapted for clearing a CRC of a socket in the control block to zero, as well as prompt the various foregoing operations. Still yet, the CRC indicator may prompt the storage of the CRC at a location in memory indicated by the CRC indicator, to avoid the need for recalculation during retransmission, in the manner that will soon be set forth.


Thus, in more general terms, the data object may be used to communicate state information associated with the second network protocol layer to the first network protocol layer, where the first network protocol layer resides below the second network protocol layer. Further, the data object may be used to communicate (i.e. feedback, etc.) state information associated with the first network protocol layer to the second network protocol layer. More information regarding one exemplary embodiment of a data list/data object, and the manner in which such entities are stored in the queues will be set forth during reference to FIGS. 6 and 7.


To distinguish between the data list elements and data object elements in the queues, an indicator may be provided for determining whether the information is to be processed as a data list element or a data object element. Such indicator may take any form including, but not limited to a bit, etc.


Subsequently, in operation 404, an instruction message is communicated from the processor to the transport offload engine to provide the information necessary to access the information (i.e. data lists, data objects, etc.) queued in operation 402. In one embodiment, the instruction message may take the form of an instruction block (IB) that may include any desired information necessary to allow the transport offload engine to retrieve the data to be communicated. For example, the IB may indicate the number of data elements to be communicated, etc.
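
As a purely illustrative sketch (the structure below and its field names are assumptions, not a layout taken from the present description), an instruction block telling the transport offload engine how much new queue content to consume for a given socket might be expressed as follows.

    #include <stdint.h>

    /* Hypothetical instruction block (IB): the processor identifies the socket
     * whose queue has new entries and how many 32-bit words were appended.
     * Field names and widths are illustrative assumptions. */
    struct instruction_block {
        uint32_t socket_id;    /* socket/control block the queued entries belong to   */
        uint32_t queue_words;  /* number of words just written to that socket's queue */
    };

    /* Posting the IB through a device mailbox/doorbell is one common approach;
     * the actual delivery mechanism is device-specific. */
    void post_instruction_block(volatile struct instruction_block *mailbox,
                                uint32_t socket_id, uint32_t queue_words)
    {
        mailbox->socket_id = socket_id;
        mailbox->queue_words = queue_words;
    }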


Equipped with the instruction message of operation 404, the transport offload engine may subsequently access the information stored by the processor. Note operation 406. As an option, direct memory access (DMA) operations may be used to access the information. Moreover, the information may be maintained by the processor until receipt of the transmitted data has been acknowledged. Once acknowledged, a status message may be transmitted from the transport offload engine to the processor for indicating which information may be disposed of, or overwritten. As an option, such status message may include a number of bytes that may be released.


With the information accessed, each data list may be processed by accessing the data to be transmitted and segmenting the data, as indicated in operation 407. Such segmented data may then be transmitted in the network. Any different/supplemental processing may be based on the content, if any, of the data object.


For example, if the data object includes the aforementioned byte indicator (see decision 408) and marking is desired, markers may be generated and transmitted with the data in the network. Note operation 410. Such markers may be utilized to facilitate network communication management by informing receiving hosts as to when a subsequent PDU can be expected, and/or an occurrence of a previous PDU. More exemplary information regarding such markers will be set forth in greater detail during reference to FIG. 8.


With reference now to FIG. 4B, if the data object includes the foregoing CRC indicator (see decision 412), various operations may take place. For example, a CRC may be calculated by applying the polynomial in the manner set forth hereinabove. Note operation 414. Such CRC may then be transmitted with the data in the network, as indicated in operation 416.


Still yet, the CRC indicator may prompt transmission of a status message to a processor that includes the CRC, where the CRC is stored by the processor in response to the status message. See operations 418-420. To this end, the stored CRC may be used during a retransmission request without recalculation, thus facilitating network communication management.
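
A minimal sketch of this feedback path, with hypothetical structure and function names, is set forth below: the driver records the CRC reported in the status message so that a later retransmission reuses the stored value instead of prompting a recalculation.

    #include <stdint.h>

    /* Hypothetical 4-byte CRC slot queued by the driver (see the CRC data
     * segment described with reference to FIG. 6): the offload engine fills it
     * once, and the same location is referenced again if the PDU must be
     * retransmitted. */
    struct crc_slot {
        uint32_t crc;    /* calculated CRC returned in the status message  */
        int      valid;  /* set once the status message has been processed */
    };

    /* Driver handling of a status message that carries the calculated CRC. */
    void record_crc_from_status(struct crc_slot *slot, uint32_t reported_crc)
    {
        slot->crc = reported_crc;
        slot->valid = 1;
    }

    /* During a retransmission request, the stored value is reused as-is, so
     * the offload engine need not recalculate it (error handling omitted). */
    uint32_t crc_for_retransmission(const struct crc_slot *slot)
    {
        return slot->crc;
    }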


Thus, the transport offload engine may utilize the data object to process (i.e. calculate, etc.) data associated with an upper one of the network protocol layers. Such processed data may then be inserted into a data stream in which the data is communicated utilizing the transport offload engine. The processed data may further be fed back to a processor for retransmission purposes. To this end, processing may be offloaded from a processor. In other words, processing may be split between a software driver and the transport offload engine. Moreover, the data may optionally be transmitted between the processor and the transport offload engine only once, thus optionally freeing up resources on the transport offload engine.


Specifically, upon the transport offload engine detecting that a retransmission is required, the transport offload engine informs the processor with a retransmission status message, or retransmission requested status block (RRSB). The retransmission status message may contain a sequence number of the requested retransmission and a length of the data requested. If the length of the data requested is zero, it may be assumed that the transport offload engine does not know how much data has been lost. In most cases, if a selective acknowledgement (SACK) feature is enabled, the transport offload engine may know how much data is to be retransmitted.


When the host processor receives the retransmission status message, a retransmission instruction message, or retransmission instruction block (IB), is generated. The retransmission instruction message may contain the sequence number that was passed by the retransmission status message and a series of data list (i.e. SGL) elements that are copied from a queue that is associated with the specified socket. These data list elements may contain the same data object(s) (i.e. metadata, etc.) as placed in the queue beforehand.


If markers are used with the socket connection, the first element in the data list may be a byte indicator data object specifying the number of bytes until the next PDU. The data list elements that are CRC data list entries may have CRC flags masked off, as the transport offload engine need not necessarily know about CRCs during retransmission and may instead send the previously stored value. Alternatively, the transmit logic in the offload engine may ignore the CRC flags when encountered during retransmissions.


On reception of the retransmission instruction message, the transport offload engine may transmit a segment or a series of segments to service the instruction message. No status message need necessarily be generated when this retransmission instruction message has been completed. Freeing up of the elements on the queue may be handled when ACKs that come in from the TCP peer host acknowledge the data.


Between the time that the retransmission status message was transmitted to the host and the retransmission instruction message was sent to the transport offload engine, an ACK in the TCP stream may have acknowledged the data that was requested. The transport offload engine may check for this case by comparing the sequence number included in the retransmission instruction message with a received ACK number stored in a control block associated with the socket connection for the retransmitted data. The results of the comparison are used to determine if the data should still be retransmitted.
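
One way to express such a comparison, assuming 32-bit TCP sequence numbers and conventional wraparound-safe arithmetic, is sketched below; the function names are illustrative.

    #include <stdint.h>

    /* Wraparound-safe comparison of 32-bit TCP sequence numbers: true if
     * sequence number a precedes sequence number b. */
    static int seq_before(uint32_t a, uint32_t b)
    {
        return (int32_t)(a - b) < 0;
    }

    /* Decide whether a retransmission instruction should still be serviced.
     * retrans_seq is the sequence number carried in the retransmission
     * instruction message; rcv_ack is the latest ACK number recorded in the
     * control block for the socket connection. Bytes strictly below rcv_ack
     * have already been acknowledged, so retransmission is only needed when
     * retrans_seq is not below rcv_ack. */
    int should_retransmit(uint32_t retrans_seq, uint32_t rcv_ack)
    {
        return !seq_before(retrans_seq, rcv_ack);
    }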



FIG. 5 illustrates exemplary queues and a control block system 500 for transmitting data in a network, in accordance with one embodiment. As an option, the system 500 may be used in the context of the disclosure of the previous figures. Of course, however, it should be noted that the system 500 may be implemented in any desired context. Most importantly, the exemplary system 500 is set forth for illustrative purposes only, and should not be considered as limiting in any manner.


As shown, a plurality of queues 502 is provided on a processor (see, for example, processor 304 of FIG. 3), each for a separate socket, or network connection. Further provided is a control block 504 on a transport offload engine (see, for example, transport offload engine 312 of FIG. 3) adapted for tracking and managing the communication of data in a network, using the sockets. To accomplish this, the control block 504 receives information via an instruction message that is transmitted from the processor to the transport offload engine to provide the instrumentality necessary to access the queues 502 and the related data lists, data objects, etc. queued in operation 402 of FIG. 4.


Specifically, the control block 504 is capable of indicating a start of a queue 502, an end of the queue 502, and a next read pointer for indicating a next element of queued information to read. See 506. To further facilitate tracking information to be processed, a number of words pending to be read 508 may be tracked utilizing the control block 504. As an option, the words pending to be read 508 may be incremented upon receipt of an instruction block from the processor.
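
For illustration only, the per-socket queue-tracking state maintained in the control block 504 might be pictured as follows; the field names and widths are assumptions rather than a definitive layout.

    #include <stdint.h>

    /* Illustrative per-socket queue-tracking fields: the queue occupies a
     * region of host memory delimited by start and end addresses, next_read
     * points at the next element of queued information to read, and
     * words_pending counts queued words not yet consumed. */
    struct socket_queue_state {
        uint64_t queue_start;    /* start of the queue in host memory */
        uint64_t queue_end;      /* end of the queue in host memory   */
        uint64_t next_read;      /* next element to read              */
        uint32_t words_pending;  /* words written but not yet read    */
    };

    /* Upon receipt of an instruction block, credit the newly added words. */
    void on_instruction_block(struct socket_queue_state *cb, uint32_t added_words)
    {
        cb->words_pending += added_words;
    }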



FIG. 6 illustrates a more detailed design 600 of the contents of the queues 502 of the system 500 of FIG. 5, in accordance with one embodiment. Again, the design 600 may be used in the context of the disclosure of the previous figures, or implemented in any desired context.


As shown, each data list 601 may be equipped with an address 614 pointing to a location in memory where the data to be transmitted is stored. Further provided is a length 612 associated with the data. Still yet, a flag field 610 is provided to identify the type of data list entry. As an example, at least one flag is included indicating whether the data is to include a cyclic redundancy check (CRC). As an option, various other flags may be provided, as desired.


In use, each data list 601 (i.e. SGL, etc.) may correspond with a portion of a PDU 607, or marker. For example, each data list 601 may point to a location in memory where data 602 or a header 606 of the PDU 607 is stored. Of course, multiple data lists 601 may correspond to a single portion (i.e. data 602, etc.) of the PDU 607. If multiple data lists 601 are to correspond to a single header 606 of the PDU 607, a CRC 608 may be positioned between the data 602 and header 606 portions of the PDU 607, in the following manner.


Interleaved among the data 602 and header 606 portions of the PDUs 607 may be CRCs 608. Such CRCs 608 may include a data object (see, for example, CRC indicator 615). Moreover, in the case where the CRC 608 precedes a header 606, the CRCs 608 may have a data list 601 associated therewith for pointing to the appropriate location in memory where the calculated CRC 608 may be stored. In one embodiment, the data list 601 associated with a CRC 608 may point to a 4-byte data segment and include a dump CRC bit. Of course, the CRC 608 may follow the data 602 and header 606 portions of the PDUs 607, or be positioned in any desired manner.


It should be further noted that the CRCs 608 may be positioned adjacent to the data 602 and/or header 606 portions of the PDUs 607 to provide cyclic redundancy checks associated therewith. As mentioned earlier, the stored CRCs 608 may be used during a retransmission request without recalculation. Thus, since the CRC 608 does not need to be recalculated in such a situation, cyclic redundancy checks may require less processing during retransmission, as set forth earlier during reference to FIGS. 4A and 4B.


Still yet, a byte indicator 613 may precede the PDUs 607 in order to indicate a number of bytes until a subsequent PDU (and/or to indicate an occurrence of a previous PDU) for marking purposes. Given such byte indicator 613 along with a current TCP sequence number, the transport offload engine may set a next PDU pointer to point to a subsequent PDU. More information regarding the manner in which markers are inserted within a data stream will be set forth in greater detail during reference to FIG. 8.
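
The sketch below illustrates one possible ordering of queue entries describing a single PDU under the foregoing scheme; the entry kinds, addresses, and lengths are placeholders, and the placement of the byte indicator and CRC slots depends on whether marking and digests are enabled.

    #include <stdint.h>

    /* Illustrative entry kinds; actual hardware would encode these in the flag
     * and opcode bits described with reference to FIG. 7 below. */
    enum entry_kind { ENTRY_SGL, ENTRY_BYTE_INDICATOR, ENTRY_CRC };

    struct queue_entry {
        enum entry_kind kind;
        uint64_t addr;   /* SGL and CRC entries: host memory address      */
        uint32_t len;    /* SGL entries: length of the region in bytes    */
        uint32_t value;  /* byte-indicator entries: bytes to the next PDU */
    };

    /* One hypothetical PDU: marker metadata, header, 4-byte header CRC slot,
     * payload, and 4-byte data CRC slot. Addresses and lengths are placeholders. */
    static const struct queue_entry example_pdu[] = {
        { ENTRY_BYTE_INDICATOR, 0x0,        0,    512 },
        { ENTRY_SGL,            0x10000000, 48,   0   },  /* PDU header  */
        { ENTRY_CRC,            0x10000030, 4,    0   },  /* header CRC  */
        { ENTRY_SGL,            0x20000000, 4096, 0   },  /* PDU payload */
        { ENTRY_CRC,            0x20001000, 4,    0   },  /* data CRC    */
    };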


As an option, the data object may further include a start indicator (not specifically shown), where the start indicator is adapted for indicating a start of a PDU. Still yet, the data object may include an end indicator (not specifically shown) for indicating an end of a PDU. This may allow the queuing of many PDUs to be processed when the start and end of such PDUs need to be determined in an arbitrary list of data pointers.


In still another embodiment, the data object may include a transmission control protocol urgent (TCP URG) indicator (not specifically shown). In use, such TCP URG indicator may be adapted for indicating a number of bytes until a TCP URG section is complete. By this feature, TCP URG sections are not lost on retransmit, as happens on many modern-day TCP stacks.


As further shown in FIG. 6, pointers may be used to track the status of processing of the queue 502 of data lists 601 and data objects 613, 615. For example, in addition to the start, end and next pointer mentioned in FIG. 5, an acknowledgment (ACK) pointer may be included for identifying which portion of the queue no longer needs to be stored (since the corresponding data has been successfully sent), as indicated by the transport offload engine status message mentioned earlier.


In an embodiment specific to an implementation in the context of an iSCSI protocol, a PDU pointer may also be included to indicate a beginning of a PDU. Such PDU pointer may be updated once the aforementioned ACK pointer moves onto a subsequent PDU.



FIG. 7 illustrates a more detailed, exemplary design 700 of the contents of the queue of FIG. 6, in accordance with one embodiment. Again, the design 700 may be used in the context of the disclosure of the previous figures, or implemented in any desired context.


As shown, the portion of each data list 601 including the flags 610 and length 612 of FIG. 6 may take various forms. A first bit 702 indicates whether the associated information should be processed as a data list or a data object (i.e. metadata, etc.). Moreover, in the case where the first bit 702 indicates that the information should be processed as a data list, a second bit 706 operates as a CRC indicator for indicating that the particular data list should not be processed normally, but instead used to clear a CRC of a socket to zero and store a calculated CRC, as set forth hereinabove. Further included are length bits 708, as well as available bits for additional design options.


On the other hand, in the case where the first bit 702 indicates that the information should be processed as a data object, an operation code 710 and further opcode-dependent data 712 are provided for facilitating network communication management.
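
The decoding of the first word of a queue entry under this scheme may be sketched as follows; the specific bit positions below are assumptions chosen for illustration, since only the roles of the first bit 702, second bit 706, length bits 708, operation code 710, and opcode-dependent data 712 are set forth above.

    #include <stdint.h>

    /* Assumed bit positions, for illustration only. */
    #define ENTRY_IS_METADATA  (1u << 31)   /* first bit 702: 1 = data object (metadata) */
    #define ENTRY_CRC_FLAG     (1u << 30)   /* second bit 706: CRC indicator (data list) */
    #define ENTRY_LENGTH_MASK  0x00FFFFFFu  /* length bits 708                           */
    #define ENTRY_OPCODE_MASK  0x7F000000u  /* operation code 710 (data object case)     */
    #define ENTRY_OPCODE_SHIFT 24

    void decode_entry_word(uint32_t word)
    {
        if (word & ENTRY_IS_METADATA) {
            /* Data object: extract the operation code; the remaining bits
             * carry the opcode-dependent data 712. */
            uint32_t opcode = (word & ENTRY_OPCODE_MASK) >> ENTRY_OPCODE_SHIFT;
            (void)opcode;
        } else {
            /* Data list element: the length and the CRC indicator; a further
             * word (or words) would carry the buffer address 614. */
            uint32_t length = word & ENTRY_LENGTH_MASK;
            int clear_and_store_crc = (word & ENTRY_CRC_FLAG) != 0;
            (void)length;
            (void)clear_and_store_crc;
        }
    }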



FIG. 8 illustrates an exemplary data stream 800 including the markers mentioned during reference to FIGS. 4A and 4B, in accordance with one embodiment. Again, the exemplary data stream 800 may be used in the context of the disclosure of the previous figures, or implemented in any desired context.


As shown, markers 806 may be inserted by the transport offload engine (see, for example, transport offload engine 312, of FIG. 3) in the context of the transport network protocol layer. Such markers 806 may be used by a receiving remote host (see, for example, remote host 206 of FIG. 2) for identifying the start (location of headers 804) of PDUs in the data stream 800. To accomplish this, the markers 806 may be inserted in the data stream 800 at fixed intervals.


To support markers 806 at a receiving host, such host may be programmed with two values, a mask and offset. As an option, the markers may be supported for intervals that are powers of two. The host may know what the starting sequence number was at the beginning of the connection and when marker support needs to be turned on. The transport offload engine may be supplied with the mask, where the mask is the marker interval. For example, a mask of 0xff may be a marker interval of 256 bytes and a mask of 0x7ff may be a marker interval of 2K bytes. The offset may be used to synchronize the marker interval with the starting sequence number of the stream. To calculate this offset, a host driver may take the starting sequence number of the connection and mask it with the interval mask.


An example may be an interval of 2K and a starting sequence number of 0xff434155. The driver may program a RECV_MARKER_MASK to 0x7ff and a RECV_MARKER_OFFSET to 0x155.


The transport offload engine may further calculate the number of bytes until the next marker. This may be accomplished by masking off the current sequence number, subtracting the offset, and taking the absolute value. A marker engine may also be turned on, and incoming segments may be examined to determine whether a marker is contained in the segment. If so, the marker may be extracted and used to update the next PDU sequence number in the socket control block.
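
The marker arithmetic may be sketched as follows, using the worked example above; the distance to the next marker is expressed here with modular arithmetic over the marker interval, which is one way of realizing the calculation just described (markers are assumed to sit wherever the masked sequence number equals the programmed offset).

    #include <stdint.h>
    #include <stdio.h>

    /* The marker interval is expressed as a power-of-two mask, e.g. 0x7ff for 2K. */
    static uint32_t marker_offset(uint32_t starting_seq, uint32_t mask)
    {
        return starting_seq & mask;               /* RECV_MARKER_OFFSET */
    }

    /* Bytes from the current sequence number until the next marker position. */
    static uint32_t bytes_to_next_marker(uint32_t seq, uint32_t mask, uint32_t offset)
    {
        return (offset - (seq & mask)) & mask;
    }

    int main(void)
    {
        uint32_t mask = 0x7ffu;                   /* 2K marker interval       */
        uint32_t isn  = 0xff434155u;              /* starting sequence number */
        uint32_t off  = marker_offset(isn, mask); /* 0x155, as in the example */

        printf("RECV_MARKER_MASK   = 0x%x\n", (unsigned)mask);
        printf("RECV_MARKER_OFFSET = 0x%x\n", (unsigned)off);
        printf("bytes to next marker from seq 0xff434800 = %u\n",
               (unsigned)bytes_to_next_marker(0xff434800u, mask, off));
        return 0;
    }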


While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of a preferred embodiment should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims
  • 1. A computer program product stored in a non-transitory computer readable medium, comprising: computer code including a data object for communicating between a first network protocol layer and a second network protocol layer; wherein the data object facilitates network communication management utilizing a transport offload engine; wherein the data object is stored with a data list in at least one queue; wherein an indicator distinguishes between the data list and the data object in the at least one queue; wherein information associated with data to be transmitted in a network is written to the at least one queue, wherein the information in the at least one queue includes a bit indicating whether the information is to be processed as the data list or data object, each data list having an address in memory where the data to be transmitted is stored, a length associated with the data, and a flag indicating whether the data is to include a cyclic redundancy check (CRC), the data object including a byte indicator indicating a number of bytes until a subsequent protocol data unit (PDU) or a CRC indicator used to clear a CRC of a socket to zero; wherein an instruction message transmitted from a processor to the transport offload engine indicates a start of the at least one queue, an end of the at least one queue, and a next read pointer; wherein the information is accessed from the processor by the transport offload engine utilizing the instruction message via direct memory access (DMA) operations; wherein each data list is processed by the transport offload engine by accessing the data to be transmitted, segmenting the data, and transmitting the data in the network; wherein the data object includes the CRC indicator, and wherein the transport offload engine generates a calculated CRC, transmitting the CRC with the data in the network, and transmitting to the processor a status message that includes the CRC; wherein the CRC is stored in response to the status message and used during a retransmission request without recalculation.
  • 2. The computer program product as recited in claim 1, wherein the first network protocol layer includes a transport protocol layer.
  • 3. The computer program product as recited in claim 2, wherein the second network protocol layer includes a layer above the transport protocol layer.
  • 4. The computer program product as recited in claim 3, wherein the second network protocol layer includes a small computer system interface (SCSI) protocol.
  • 5. The computer program product as recited in claim 3, wherein the second network protocol layer is selected from the group consisting of an Internet small computer system interface (iSCSI) protocol, a remote direct memory access (RDMA) protocol, a direct data placement (DDP) protocol, a markers with protocol data unit (PDU) alignment (MPA) protocol, and a network file system (NFS) protocol.
  • 6. The computer program product as recited in claim 1, wherein the data object includes the metadata.
  • 7. The computer program product as recited in claim 1, wherein the data object is communicated between the processor and the transport offload engine.
  • 8. The computer program product as recited in claim 1, wherein the data list includes at least one of a scatter-gather list (SGL) and a memory-descriptor list (MDL).
  • 9. The computer program product as recited in claim 7, wherein the processor communicates the instruction message to the transport offload engine identifying a location in memory where the data list and the data object are stored.
  • 10. The computer program product as recited in claim 1, wherein the data object communicates state information associated with the second network protocol layer to the first network protocol layer, wherein the first network protocol layer resides below the second network protocol layer.
  • 11. The computer program product as recited in claim 1, wherein the data object communicates state information associated with the first network protocol layer to the second network protocol layer, wherein the first network protocol layer resides below the second network protocol layer.
  • 12. The computer program product as recited in claim 1, wherein the data object includes the metadata including the byte indicator.
  • 13. The computer program product as recited in claim 1, wherein the data object includes a start indicator.
  • 14. The computer program product as recited in claim 13, wherein the start indicator is adapted for indicating a start of a protocol data unit (PDU).
  • 15. The computer program product as recited in claim 1, wherein the data object includes an end indicator.
  • 16. The computer program product as recited in claim 15, wherein the end indicator is adapted for indicating an end of a protocol data unit (PDU).
  • 17. The computer program product as recited in claim 1, wherein the data object includes a data integrity check.
  • 18. The computer program product as recited in claim 17, wherein the data integrity check includes the cyclic redundancy check (CRC) indicator.
  • 19. The computer program product as recited in claim 17, wherein the data integrity check includes the cyclic redundancy check (CRC) indicator, and the CRC indicator prompts calculation of the CRC, and transmission of the CRC with the data communicated in the network.
  • 20. The computer program product as recited in claim 19, wherein the CRC indicator prompts transmission of the status message to the processor that includes the CRC.
  • 21. The computer program product as recited in claim 1, wherein the transport offload engine utilizes the data object to process data associated with an upper one of the network protocol layers.
  • 22. The computer program product as recited in claim 21, wherein the processed data is inserted into a data stream in which the data is communicated utilizing the transport offload engine.
  • 23. The computer program product as recited in claim 21, wherein the processed data is fed back to the processor for retransmission purposes.
  • 24. The computer program product as recited in claim 21, wherein the processing is offloaded from the processor.
  • 25. The computer program product as recited in claim 21, wherein the data is only transmitted between the processor and the transport offload engine once.
  • 26. The computer program product as recited in claim 1, wherein the data object includes a transmission control protocol urgent (TCP URG) indicator.
  • 27. The computer program product as recited in claim 26, wherein the TCP URG indicator is adapted for indicating a number of bytes until a TCP URG section is complete.
  • 28. The computer program product as recited in claim 1, wherein the data list points to a 4-byte data segment and includes a dump CRC bit.
  • 29. The computer program product as recited in claim 1, wherein the data object includes an acknowledgment indicator.
  • 30. The computer program product as recited in claim 29, wherein the acknowledgment indicator is used for indicating a portion of the at least one queue that no longer needs to be stored.
  • 31. A system, comprising: a processor; and a transport offload engine in communication with the processor and a network via a bus, the transport offload engine capable of processing a data object for communicating between a first network protocol layer and a second network protocol layer; wherein the data object facilitates network communication management utilizing the transport offload engine;
  • 32. A method, comprising: receiving data utilizing a transport offload engine in communication with a processor and a network; and processing a data object for communicating between a first network protocol layer and a second network protocol layer; wherein the data object facilitates network communication management utilizing the transport offload engine;
  • 33. A method for communicating data in a network, comprising: utilizing a processor, writing to a queue information associated with data to be transmitted in a network, wherein the information in the queue includes a bit indicating whether the information is to be processed as a data list or metadata, each data list having an address in memory where the data to be transmitted is stored, a length associated with the data, and a flag indicating whether the data is to include a cyclic redundancy check (CRC), the metadata including a byte indicator indicating a number of bytes until a subsequent protocol data unit (PDU) or a CRC indicator used to clear a CRC of a socket to zero; transmitting an instruction message from the processor to a transport offload engine for indicating a start of the queue, an end of the queue, and a next read pointer; utilizing the transport offload engine to perform the operation of: accessing the information from the processor utilizing the instruction message via direct memory access (DMA) operations, processing each data list by accessing the data to be transmitted, segmenting the data, and transmitting the data in the network, wherein the metadata includes the CRC indicator, and further comprising generating a calculated CRC, transmitting the CRC with the data in the network, and transmitting to the processor a status message that includes the CRC; and utilizing the processor, storing the CRC in response to the status message and used during a retransmission request without recalculation.
  • 34. The method as recited in claim 33, wherein the queue is provided for one of a plurality of sockets.
  • 35. The method as recited in claim 34, wherein a control block is provided to track transmission of the data via the one of the plurality of sockets.
US Referenced Citations (222)
Number Name Date Kind
212889 Bridenthal, Jr. et al. Mar 1879 A
4807111 Cohen et al. Feb 1989 A
4839851 Maki Jun 1989 A
5012489 Burton et al. Apr 1991 A
5056058 Hirata et al. Oct 1991 A
5161193 Lampson et al. Nov 1992 A
5163131 Row et al. Nov 1992 A
5307413 Denzer Apr 1994 A
5426694 Hebert Jun 1995 A
5430727 Callon Jul 1995 A
5440551 Suzuki Aug 1995 A
5455599 Cabral et al. Oct 1995 A
5485460 Schrier et al. Jan 1996 A
5495480 Yoshida Feb 1996 A
5499353 Kadlec et al. Mar 1996 A
5513324 Dolin, Jr. et al. Apr 1996 A
5519704 Farinacci et al. May 1996 A
5544357 Huei Aug 1996 A
5546453 Hebert Aug 1996 A
5566170 Bakke et al. Oct 1996 A
5577105 Baum et al. Nov 1996 A
5577172 Vatland et al. Nov 1996 A
5577237 Lin Nov 1996 A
5579316 Venters et al. Nov 1996 A
5581686 Koppolu et al. Dec 1996 A
5596702 Stucka et al. Jan 1997 A
5598410 Stone Jan 1997 A
5619650 Bach et al. Apr 1997 A
5621434 Marsh Apr 1997 A
5625678 Blomfield-Brown Apr 1997 A
5625825 Rostoker et al. Apr 1997 A
5634015 Chang et al. May 1997 A
5636371 Yu Jun 1997 A
5640394 Schrier et al. Jun 1997 A
5650941 Coelho et al. Jul 1997 A
5663951 Danneels et al. Sep 1997 A
5664162 Dye Sep 1997 A
5666362 Chen et al. Sep 1997 A
5675507 Bobo, II Oct 1997 A
5678060 Yokoyama et al. Oct 1997 A
5680605 Torres Oct 1997 A
5684954 Kaiserswerth et al. Nov 1997 A
5687314 Osman et al. Nov 1997 A
5696899 Kalwitz Dec 1997 A
5699350 Kraslavsky Dec 1997 A
5701316 Alferness et al. Dec 1997 A
5727149 Hirata et al. Mar 1998 A
5734852 Zias et al. Mar 1998 A
5734865 Yu Mar 1998 A
5748905 Hauser et al. May 1998 A
5754540 Liu et al. May 1998 A
5754556 Ramseyer et al. May 1998 A
5754768 Brech et al. May 1998 A
5761281 Baum et al. Jun 1998 A
5778178 Arunachalam Jul 1998 A
5790546 Dobbins et al. Aug 1998 A
5790676 Ganesan et al. Aug 1998 A
5802287 Rostoker et al. Sep 1998 A
5802306 Hunt Sep 1998 A
5805816 Picazo, Jr. et al. Sep 1998 A
5809235 Sharma et al. Sep 1998 A
5815516 Aaker et al. Sep 1998 A
5818935 Maa Oct 1998 A
5826032 Finn et al. Oct 1998 A
5854750 Phillips et al. Dec 1998 A
5870549 Bobo, II Feb 1999 A
5870622 Gulick et al. Feb 1999 A
5872919 Wakeland Feb 1999 A
5877764 Feitelson et al. Mar 1999 A
5894557 Bade et al. Apr 1999 A
5909546 Osborne Jun 1999 A
5918051 Savitzky et al. Jun 1999 A
5920732 Riddle Jul 1999 A
5923892 Levy Jul 1999 A
5935268 Weaver Aug 1999 A
5937169 Connery et al. Aug 1999 A
5941988 Bhagwat et al. Aug 1999 A
5943481 Wakeland Aug 1999 A
5946487 Dangelo Aug 1999 A
5966534 Cooke et al. Oct 1999 A
5968161 Southgate Oct 1999 A
5974518 Nogradi Oct 1999 A
5991299 Radogna et al. Nov 1999 A
5999974 Ratcliff et al. Dec 1999 A
6014699 Ratcliff et al. Jan 2000 A
6034963 Minami et al. Mar 2000 A
6046980 Packer Apr 2000 A
6049857 Watkins Apr 2000 A
6061368 Hitzelberger May 2000 A
6061742 Stewart et al. May 2000 A
6076115 Sambamurthy et al. Jun 2000 A
6078736 Guccione Jun 2000 A
6081846 Hyder et al. Jun 2000 A
6092110 Maria et al. Jul 2000 A
6092229 Boyle et al. Jul 2000 A
6098188 Kalmanek, Jr. et al. Aug 2000 A
6101543 Alden et al. Aug 2000 A
6122670 Bennett et al. Sep 2000 A
6151625 Swales et al. Nov 2000 A
6157955 Narad et al. Dec 2000 A
6172980 Flanders et al. Jan 2001 B1
6172990 Deb et al. Jan 2001 B1
6173333 Jolitz et al. Jan 2001 B1
6182228 Boden Jan 2001 B1
6185619 Joffe et al. Feb 2001 B1
6208651 Van Renesse et al. Mar 2001 B1
6226680 Boucher et al. May 2001 B1
6230193 Arunkumar et al. May 2001 B1
6233626 Swales et al. May 2001 B1
6247060 Boucher et al. Jun 2001 B1
6247068 Kyle Jun 2001 B1
6247173 Subrahmanyam Jun 2001 B1
6327625 Wang et al. Dec 2001 B1
6330659 Poff et al. Dec 2001 B1
6334153 Boucher Dec 2001 B2
6341129 Schroeder et al. Jan 2002 B1
6345301 Burns et al. Feb 2002 B1
6347347 Brown et al. Feb 2002 B1
6389479 Boucher May 2002 B1
6389537 Davis et al. May 2002 B1
6393487 Boucher May 2002 B2
6397316 Fesas, Jr. May 2002 B2
6427169 Elzur Jul 2002 B1
6427171 Craft Jul 2002 B1
6427173 Boucher Jul 2002 B1
6430628 Connor Aug 2002 B1
6434620 Boucher Aug 2002 B1
6460080 Shah et al. Oct 2002 B1
6470415 Starr Oct 2002 B1
6530061 Labatte Mar 2003 B1
6591302 Boucher Jul 2003 B2
6609225 Ng Aug 2003 B1
6629141 Elzur et al. Sep 2003 B2
6658480 Boucher Dec 2003 B2
6687758 Craft Feb 2004 B2
6697868 Craft Feb 2004 B2
6751665 Philbrick Jun 2004 B2
6757746 Boucher Jun 2004 B2
6807581 Starr Oct 2004 B1
6938092 Burns Aug 2005 B2
6941386 Craft Sep 2005 B2
6965941 Boucher Nov 2005 B2
6996070 Starr et al. Feb 2006 B2
7032228 McGillis et al. Apr 2006 B1
7042898 Blightman May 2006 B2
7124205 Craft et al. Oct 2006 B2
7177941 Biran et al. Feb 2007 B2
7562158 Shah et al. Jul 2009 B2
20010021949 Blightman et al. Sep 2001 A1
20010023460 Boucher Sep 2001 A1
20010027496 Boucher et al. Oct 2001 A1
20010036196 Brightman Nov 2001 A1
20010037397 Boucher Nov 2001 A1
20010037406 Philbrick Nov 2001 A1
20010047433 Boucher et al. Nov 2001 A1
20020055993 Shah et al. May 2002 A1
20020085562 Hufferd et al. Jul 2002 A1
20020087732 Boucher Jul 2002 A1
20020091844 Craft Jul 2002 A1
20020095519 Philbrick Jul 2002 A1
20020120899 Gahan et al. Aug 2002 A1
20020147839 Boucher Oct 2002 A1
20020156927 Boucher Oct 2002 A1
20020161919 Boucher Oct 2002 A1
20020163888 Grinfeld Nov 2002 A1
20030005142 Elzur et al. Jan 2003 A1
20030005143 Elzur et al. Jan 2003 A1
20030014544 Pettey Jan 2003 A1
20030016669 Pfister et al. Jan 2003 A1
20030031172 Grinfeld Feb 2003 A1
20030046330 Hayes Mar 2003 A1
20030046418 Raval et al. Mar 2003 A1
20030056009 Mizrachi et al. Mar 2003 A1
20030058870 Mizrachi et al. Mar 2003 A1
20030061505 Sperry et al. Mar 2003 A1
20030066011 Oren Apr 2003 A1
20030079033 Craft Apr 2003 A1
20030084185 Pinkerton May 2003 A1
20030084212 Butterfield May 2003 A1
20030095567 Lo et al. May 2003 A1
20030115350 Uzrad-Nali et al. Jun 2003 A1
20030115417 Corrigan Jun 2003 A1
20030128704 Mizrachi et al. Jul 2003 A1
20030140124 Burns Jul 2003 A1
20030145101 Mitchell et al. Jul 2003 A1
20030145270 Holt Jul 2003 A1
20030163777 Holt Aug 2003 A1
20030167346 Craft Sep 2003 A1
20030177435 Budd et al. Sep 2003 A1
20030200284 Philbrick Oct 2003 A1
20040003126 Boucher Jan 2004 A1
20040037319 Pandya Feb 2004 A1
20040054813 Boucher Mar 2004 A1
20040062245 Sharp Apr 2004 A1
20040062246 Boucher Apr 2004 A1
20040064578 Boucher Apr 2004 A1
20040064589 Boucher Apr 2004 A1
20040064590 Starr Apr 2004 A1
20040073703 Boucher Apr 2004 A1
20040078462 Philbrick Apr 2004 A1
20040088262 Boucher May 2004 A1
20040100952 Boucher May 2004 A1
20040111535 Boucher Jun 2004 A1
20040117509 Craft Jun 2004 A1
20040158640 Philbrick Aug 2004 A1
20040158793 Blightman Aug 2004 A1
20040240435 Boucher Dec 2004 A1
20050122986 Starr Jun 2005 A1
20050141561 Craft Jun 2005 A1
20050143112 Jonsson Jun 2005 A1
20050149645 Tsuruta Jul 2005 A1
20050160139 Boucher Jul 2005 A1
20050175003 Craft Aug 2005 A1
20050182841 Sharp Aug 2005 A1
20050198198 Craft Sep 2005 A1
20050204058 Philbrick Sep 2005 A1
20050278459 Boucher Dec 2005 A1
20060009952 Anderson et al. Jan 2006 A1
20060010238 Craft Jan 2006 A1
20070062245 Fuller et al. Mar 2007 A1
20070206587 Ramaiah et al. Sep 2007 A1
20080253395 Pandya Oct 2008 A1
Foreign Referenced Citations (34)
Number Date Country
4595297 May 1998 AU
7364898 Nov 1998 AU
4435999 Dec 1999 AU
723724 Sep 2000 AU
0070603 Mar 2001 AU
734115 Jun 2001 AU
0741089 Nov 2001 AU
0228874 May 2002 AU
2265692 May 1998 CA
2287413 Nov 1998 CA
2328829 Dec 1999 CA
2265692 Aug 2001 CA
1237295 Dec 1999 CN
1266512 Sep 2000 CN
1305681 Jul 2001 CN
447205 Jul 2001 TW
448407 Aug 2001 TW
WO9821655 May 1998 WO
WO 9850852 Nov 1998 WO
WO 9965219 Dec 1999 WO
WO0013091 Mar 2000 WO
WO0027519 Sep 2000 WO
WO 0113583 Feb 2001 WO
WO 0128179 Apr 2001 WO
WO0227519 Apr 2002 WO
WO 0239302 May 2002 WO
WO 02059757 Aug 2002 WO
WO 02086674 Oct 2002 WO
WO 03021443 Mar 2003 WO
WO 03021447 Mar 2003 WO
WO 03021452 Mar 2003 WO
WO2005057945 Dec 2003 WO
WO 2005057945 Jun 2005 WO