The present invention relates to improving TCP performance and, more particularly, to a method for reducing packet loss using cross layer information during data transfer.
Transport control protocol (TCP) uses several mechanisms to avoid congestion, including slow start, congestion avoidance, fast retransmit and recovery, and selective acknowledgement. Packet drops along the path severely reduce the throughput of a TCP source, since a packet drop is seen as an indication of network congestion. A TCP source, after detecting a packet loss, employs a slow start and congestion avoidance procedure to limit the rate at which packets are sent into the network. A TCP source's internal parameter, the transmission window size, controls the number of outstanding unacknowledged packets in the communication pipe between a sender and the receiver. Every time a packet is dropped in the path, a TCP source must follow the slow start and congestion avoidance procedure. This procedure involves reducing the transmission window to just one packet, and then slowly increasing it as acknowledgements are received from the receiver. Multiple packet losses in a short duration may result in repeated aborting of the recovery procedure and restarting it from the initial state of a transmission window of just one packet. This can have a very serious impact on the throughput of a source. However, there is no procedure in the current art that lets a TCP source minimize multiple packet losses and thereby avoid the resulting drastic throughput reduction.
A TCP data source currently cannot request any special handling of packets by intermediate packet routing entities along a network path to avoid multiple losses. This inability of the TCP data source may cause unfair distribution of bandwidth among competing sources, especially when a network link is congested and an intermediate routing node can only accommodate a limited amount of network traffic. In this scenario, packets from a TCP data source can be repeatedly dropped by the intermediate routing node, causing recovery attempts to fail repeatedly, and the node can remain in recovery mode for quite some time. As a result, this node can suffer unduly while other nodes may not see any packet loss at all. This scenario is particularly acute in a wireless network environment.
Therefore, it is desirable to provide a method for reducing multiple packet losses for a TCP data source so that each source is able to have its fair share of network bandwidth.
In accordance with the present invention, an improved method is provided for reducing packet loss using cross layer information. In one aspect of the invention, the method includes: monitoring a transmission state of a data source, where the transmission state is defined in accordance with TCP's internal state machine; marking data packets with a pre-defined code (e.g., a drop preference) at the Internet Protocol (IP) layer based on the transmission state; and transmitting the marked data packets from the data source. The code used for marking the IP header of a data packet conveys the requested handling of the packet at intermediate routing or forwarding entities. In a preferred embodiment, this code may indicate a drop preference for the packet relative to packets from other sources.
Further areas of applicability of the present invention will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Transport Control Protocol (TCP) is a well known communication protocol that operates at the transport layer (i.e., layer 4) of the Open Systems Interconnection (OSI) model. Briefly, TCP enables two hosts to establish a connection and exchange streams of data across a network environment. In addition, TCP provides mechanisms that ensure delivery of data between the two hosts. While the following description is provided with reference to TCP, it is readily understood that the broader aspects of the present invention are applicable to other types of communication protocols, such as the User Datagram Protocol (UDP), which may operate at the transport layer.
TCP uses several mechanisms to avoid congestion in the network, such as slow start, congestion avoidance, fast retransmit and fast recovery, and selective acknowledgement. Even though slow start and congestion avoidance are two separate algorithms, in practice they are implemented together. The sender keeps two state variables for congestion control: a slow start/congestion window, cwnd, and a threshold size, ssthresh, used to switch between the two algorithms. The sender's output routine always sends the minimum of cwnd and the window advertised by the receiver. ssthresh is initialized to a large value. On a timeout caused by a packet loss, half the current window size (cwnd) is recorded in ssthresh (this is the multiplicative decrease part of the congestion avoidance algorithm), and then cwnd is set to 1 packet (this initiates slow start).
When new TCP data is acknowledged, if cwnd is less than ssthresh, cwnd is incremented by one packet per acknowledgement (slow start); otherwise, cwnd is incremented by 1/cwnd per acknowledgement (congestion avoidance). Thus slow start opens the window quickly up to what congestion avoidance regards as a safe operating point, and then congestion avoidance takes over and slowly increases the window size to probe for more bandwidth becoming available along the network path.
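For purposes of illustration only, the combined slow start and congestion avoidance behavior described above may be sketched as follows. The sketch is a simplified Python illustration that counts the window in packets rather than bytes and omits details such as fast retransmit and delayed acknowledgements; the class and variable names are chosen for illustration and do not correspond to any particular TCP implementation.

```python
class CongestionControl:
    """Simplified sketch of TCP slow start and congestion avoidance
    (windows counted in packets rather than bytes)."""

    def __init__(self):
        self.cwnd = 1.0          # slow start / congestion window
        self.ssthresh = 65535.0  # initialized to a large value

    def on_timeout(self):
        # Multiplicative decrease: record half the current window in
        # ssthresh, then re-enter slow start from a window of one packet.
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            # Slow start: grow by one packet per acknowledgement,
            # i.e. exponential growth per round trip.
            self.cwnd += 1.0
        else:
            # Congestion avoidance: grow by roughly one packet per
            # round trip (additive increase).
            self.cwnd += 1.0 / self.cwnd

    def send_window(self, receiver_window):
        # The output routine sends at most min(cwnd, advertised window).
        return min(self.cwnd, receiver_window)
```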
A known problem with the TCP congestion control algorithm is that it allows a potentially inappropriate burst of traffic to be transmitted after TCP has been idle for a relatively long period of time. After an idle period, TCP cannot use the acknowledgement clock to strobe new segments into the network, as all acknowledgements have been drained from the network. Therefore, as described above, TCP can potentially send a cwnd-sized line-rate burst into the network after an idle period. The current standard recommends that a TCP sender use slow start to restart transmission after a relatively long period of inactivity. Slow start serves to restart the ACK clock, just as it does at the beginning of a transfer.
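By way of illustration, the recommended restart behavior may be sketched as follows, assuming an idle threshold of one retransmission timeout; the function and variable names are illustrative only.

```python
import time

INITIAL_WINDOW = 1.0  # packets; illustrative restart window

def window_after_idle(cwnd, last_send_time, rto, now=None):
    """Return the congestion window to use when transmission resumes.
    If the connection has been idle for longer than one retransmission
    timeout (rto), the window collapses to the initial value so that
    sending resumes with slow start instead of a cwnd-sized burst."""
    now = time.time() if now is None else now
    if now - last_send_time > rto:
        return min(INITIAL_WINDOW, cwnd)
    return cwnd
```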
TCP throughput is severely affected if multiple packets are dropped while TCP is trying to recover from a previous loss. Intuition suggests that avoiding multiple packet losses for a TCP flow would avoid this severe impact on TCP throughput. In addition, the relationship between packet loss and throughput can be analyzed mathematically. The throughput λ of a flow governed by TCP's additive increase multiplicative decrease (AIMD) congestion control (increase factor α=1 packet, decrease factor β=½) can be characterized by the simple relation given below:
λ ∝ S/(R√p)
where S is the packet size, R is the round-trip time between the source and the destination, and p is the packet loss probability. From this relation, one can see that as p increases, throughput decreases. On a congested path shared by multiple flows for which S and R are comparable, if the loss probability p can be distributed equally among the flows, then each flow will achieve its fair share of bandwidth. The present invention proposes a method for reducing packet loss using a cross layer information transfer mechanism, thereby improving TCP throughput.
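For illustration, the relation above may be evaluated numerically to show how throughput falls as the loss probability grows; the packet size, round-trip time, and loss rates used below are arbitrary example values, and the constant of proportionality is omitted.

```python
import math

def relative_throughput(S, R, p):
    """Throughput proportional to S / (R * sqrt(p)) for an AIMD flow;
    the constant of proportionality is omitted."""
    return S / (R * math.sqrt(p))

S = 1500 * 8   # packet size in bits (example value)
R = 0.1        # round-trip time in seconds (example value)

for p in (0.001, 0.01, 0.05):
    print(f"p = {p:5.3f} -> throughput ~ {relative_throughput(S, R, p) / 1e6:.2f} Mbit/s")

# Two flows with comparable S and R that see the same loss probability p
# obtain comparable throughput, i.e. a fair share of the bottleneck link.
```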
Since each TCP data source maintains its current transmission state, this information may be used to alter the forwarding behavior of data packets sent from the data source. For example, the transmission state is passed from the transport layer to the network layer as shown in
Data packets may be marked in a manner similar to the technique employed by the Differentiated Services architecture. The Differentiated Services architecture is based on a model in which traffic entering a network is classified at the boundaries of the network and assigned to different behavior aggregates. Within the core of the network, packets are forwarded according to the per-hop behavior associated with the applicable behavior aggregate. More specifically, data packets are marked using the DS field in the IPv4 or IPv6 header of the packet. With reference to
To mark packets, the TCP data source is configured to correlate the transmission state to an applicable codepoint value. For example, when the transmission state indicates that the data source is operating in a congestion control mode (i.e., a slow start state or a congestion avoidance state), data packets to be sent from the data source are marked with a low drop preference; whereas, when the transmission state indicates that the data source is not operating in a congestion control mode (e.g., operating in a maximum transfer mode or an otherwise normal operational mode), data packets are marked with a normal drop preference (i.e., marked with a best-effort forwarding preference). In this example, data packets having a low drop preference are less likely to be dropped than data packets having a normal drop preference.
In an exemplary embodiment, recommended codepoints are used to mark data packets in an assured forwarding per-hop-behavior group. A codepoint, AFny, encodes a class number, n, and a drop preference value, y, within the class. Thus, packets with a codepoint of AF13 will be dropped before packets with a codepoint of AF12 which in turn will be dropped before packets with a codepoint of AF11. Further details regarding the Assured Forwarding PHB Group may be found in RFC 2597 which is incorporated herein by reference.
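A simplified sketch of this mapping is given below. The AF11 through AF13 codepoint values are those recommended in RFC 2597; the use of the IP_TOS socket option to write the DS field is only a user-space approximation offered for illustration (in the described embodiment the marking would be performed within the protocol stack itself), and the state names in the mapping are assumptions.

```python
import socket

# Recommended DSCP values for the AF1y class (RFC 2597).
AF11, AF12, AF13 = 0b001010, 0b001100, 0b001110
BEST_EFFORT = 0b000000  # default per-hop behavior

# Illustrative mapping from the TCP transmission state to a codepoint:
# packets sent while the source performs congestion control receive the
# lowest drop precedence within the class; all other packets are sent
# with a normal (best effort) drop preference.
STATE_TO_DSCP = {
    "slow_start": AF11,
    "congestion_avoidance": AF11,
    "normal": BEST_EFFORT,
}

def mark_socket(sock, transmission_state):
    """Write the selected codepoint into the DS field of outgoing packets.
    The DSCP occupies the upper six bits of the former TOS octet."""
    dscp = STATE_TO_DSCP.get(transmission_state, BEST_EFFORT)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
```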
At a network layer, the data source may override the codepoint value assigned to a data packet based on the transmission state passed from the transport layer as shown in
Assuming that the routers in the network path use the well known first come first serve (FCFS) queuing discipline, all incoming IP packets on an interface are added to an output queue, and if the buffer becomes full due to congestion, the router drops new incoming packets. The marking of packets by the data source can instead be treated as assured forwarding (AF) PHB marking, and the routers can treat the packets accordingly. This would mean that IP packets sent from a data source experiencing a slow start are added to the tail of the queue even when the buffer is full, because an AF PHB requires a router to free buffer space in the output queue by dropping packets already in the queue that carry no such requirement of minimal loss. This would avoid multiple packet drops for the TCP flow and hence improve the throughput and fairness among all TCP flows.
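The queue behavior described above may be sketched as follows. A deployed assured forwarding implementation would typically rely on active queue management with multiple drop precedences (for example, RED with In/Out); the linear scan below is only a simplified illustration, and the class and variable names are assumptions.

```python
from collections import deque

class OutputQueue:
    """FCFS output queue that honors a per-packet drop preference.
    A packet is a (drop_preference, payload) tuple, where a larger
    drop_preference value means the packet may be discarded sooner."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def enqueue(self, packet):
        if len(self.queue) < self.capacity:
            self.queue.append(packet)
            return True
        # Buffer full: try to free space by discarding an already queued
        # packet whose drop preference is higher than the new arrival's.
        victim = max(self.queue, key=lambda pkt: pkt[0])
        if victim[0] > packet[0]:
            self.queue.remove(victim)
            self.queue.append(packet)
            return True
        # Otherwise fall back to plain tail drop of the new arrival.
        return False

    def dequeue(self):
        return self.queue.popleft() if self.queue else None
```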
While this exemplary embodiment modifies the codepoint value to alter packet forwarding behavior, it is readily understood that the present invention is not limited to this technique. Rather, it is envisioned that the data packet may be marked in many different ways to obtain preferential treatment along the packet route, thereby improving throughput, quality of service, or overall communication performance. Moreover, it is envisioned that the cross layer transfer mechanism may be used not only to affect the throughput of a TCP flow, but also to affect other characteristics of network traffic.
A similar technique may also be employed in a wireless network environment experiencing network congestion. In such a network environment, the feedback from the TCP layer can be passed to the IP layer with the payload, and the IP layer, in turn, passes the payload along with the TCP state information to a lower layer as defined by the OSI model. For instance, the TCP state information may be passed to the Media Access Control (MAC) layer. Since packet forwarding occurs at the MAC layer, this layer can use the information to provide higher priority or guaranteed delivery to the packets using suitable bandwidth allocation techniques. It is readily understood that the TCP state information may be passed to other layers of the OSI model for similar purposes.
An example of how cross layer information can be used to influence forwarding behavior is found in the MAC layers of forthcoming ultra-wideband (UWB) wireless technologies or in the MAC protocol for IEEE 802.11 wireless LAN networks. UWB technologies have two competing MAC/PHY proposals: the IEEE 802.15.3 MAC and the WiMedia UWB MAC. Both of these MACs have two operation modes to support two different classes of applications, namely real-time (RT) media streaming and non-real-time (NRT) data. RT services include audio/video streaming; NRT data services include Web browsing, email, and file transfer. The first transmission mode in these wireless MACs is a contention mode, such as the contention access period (CAP) in the IEEE 802.15.3 MAC or the enhanced distributed channel access (EDCA) mode in the WiMedia MAC, which is similar to enhanced CSMA/CA with QoS support in IEEE 802.11e. In this mode, devices compete for access to the channel and may need to retransmit when a collision is detected, since many devices may transmit in the same period and no reservation of the wireless channel is made. In some cases, the MACs may leave recovery of the lost packets to higher layer protocols. The second transmission mode is a contention-free mode, such as the channel time allocation period (CTAP) in the IEEE 802.15.3 MAC or the distributed reservation protocol (DRP) in the WiMedia UWB MAC. In this mode, a device reserves the wireless channel before transmitting data to avoid corruption of the packet. RT services are usually supported by making a fixed time allocation of the channel using this mode, although the contention mode can also be used.
In the case where packet forwarding occurs at the MAC layer only, the cross layer information from TCP can first be passed to the IP layer, which in turn can pass this information to the MAC layer along with the payload. The MAC layer can then select the suitable mode depending upon the cross layer information received from the upper layer. For example, when a TCP source requests a low packet drop preference from the IP layer, the IP layer must request the MAC layer to send the payload using only the contention-free (non-contention) mode to guarantee delivery of the packet.
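A minimal sketch of this cross-layer decision is given below; the mode names and the interface are assumptions made for illustration and do not correspond to the API of any particular MAC implementation.

```python
from enum import Enum

class MacMode(Enum):
    CONTENTION = "contention"            # e.g. CAP- or EDCA-style access
    CONTENTION_FREE = "contention_free"  # e.g. CTAP- or DRP-style reservation

def select_mac_mode(low_drop_preference: bool) -> MacMode:
    """Choose the MAC transmission mode from the drop preference that the
    IP layer received from the TCP source. A request for low drop
    preference maps to the reservation-based, contention-free mode so
    that delivery of the payload is not jeopardized by collisions."""
    return MacMode.CONTENTION_FREE if low_drop_preference else MacMode.CONTENTION

# Example: a source recovering from a loss asks for low drop preference,
# so its payload is scheduled in a reserved (contention-free) period.
mode = select_mac_mode(low_drop_preference=True)
```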
The description of the invention is merely exemplary in nature and, thus, variations that do not depart from the gist of the invention are intended to be within the scope of the invention. Such variations are not to be regarded as a departure from the spirit and scope of the invention.