The present invention generally relates to communication systems, more particularly to gateways operating in communication systems, and even more specifically to congestion control systems for use in such gateways.
Contemporary communication systems are designed in a tiered or layered arrangement, typically rising from a physical layer to a link layer, then to a network layer, above which sit a transport layer, a middleware layer and ultimately an application layer, which is the layer users interface with when communicating through the network. Both wireline and wireless communication systems commonly move data in packets to minimize retransmission for dropped or corrupted data packets. A widely deployed wireline network utilizes the Transmission Control Protocol (TCP)/Internet Protocol (IP) suite. TCP was originally designed for wireline networks, where packet losses are mostly caused by network congestion. The current TCP algorithm uses either the expiry of a retransmission timer or the receipt of three duplicate acknowledgements (ACKs) sent by the receiver to implicitly indicate a data packet loss event.
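The two implicit loss signals described above can be sketched as follows. This is a minimal illustration rather than any particular TCP implementation; the class name `LossDetector` and its methods are hypothetical, and the retransmission timeout value is assumed to be supplied by the caller.

```python
from dataclasses import dataclass

DUP_ACK_THRESHOLD = 3  # three duplicate ACKs imply a loss, as stated in the text


@dataclass
class LossDetector:
    """Hypothetical per-sender helper that infers loss from ACKs or a timer."""
    last_ack: int = -1
    dup_count: int = 0

    def on_ack(self, ack_no: int) -> bool:
        """Return True exactly when the third duplicate ACK arrives."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            return self.dup_count == DUP_ACK_THRESHOLD
        # A new cumulative ACK resets the duplicate counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return False

    @staticmethod
    def on_timer(elapsed: float, rto: float) -> bool:
        """Return True when the retransmission timer has expired."""
        return elapsed >= rto


detector = LossDetector()
signals = [detector.on_ack(a) for a in [1, 2, 2, 2, 2]]
```

Neither signal carries any information about why the packet was lost, which is the limitation the remainder of this background describes.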
However, networks with lossy links, such as radio frequency (RF) wireless networks, have a number of characteristics inherently different from those of wireline networks, for which the TCP/IP suite was originally designed. Notable among these differences is the transmission error measured by bit error rate (BER). A few errors per packet may be corrected by network layer encoding schemes; however, the network layer must also manage the congestion window (or buffer) of gateways at the physical layer to reduce dropped packets.
Since the original TCP protocol utilizes a packet loss as an indication of network congestion, it can work against efficient data packet throughput when wireless networks are involved in a data packet flow. In a wireless network with lossy links, packet losses due to link errors are not caused by network congestion. Unfortunately, the current TCP protocol treats these losses as congestion losses and in turn reduces the transmission speed, thus reducing communication throughput.
Unlike wireline TCP/IP networks, wireless links are characterized by high error rates. In most cases, packet losses due to corruption are more significant than congestion losses when a wireless link is involved in a TCP connection. In such a case, TCP may not be able to transmit or receive at the full available bandwidth, because the TCP algorithm will unnecessarily reduce transmission speed in an attempt to avoid perceived congestion when the losses were in fact triggered by link errors. Consequently, the current congestion control algorithms in TCP result in very poor performance over wireless links.
Moreover, it is increasingly common for wireless networks to bridge onto classical wireline TCP/IP networks. The resulting patchwork of wireline and wireless networks may never operate at full performance if the TCP/IP algorithm operates to manage congestion over the wirelessly extended network. Also, wireline networks damaged by natural disasters or man-made attacks can be rendered lossy, and the legacy TCP algorithm will not be able to effectively control congestion.
Accordingly, it is desirable to have a congestion control algorithm for gateways operating in wireless, wireline or a combination wireless/wireline communication system that is able to differentiate and respond appropriately in the presence of congestion and corruption losses. The congestion control architecture should more optimally set the congestion window (or buffer) for gateways and be backward compatible with existing legacy networks. Furthermore, other desirable features and characteristics of the present invention will become apparent from the subsequent detailed description of the invention and the appended claims, taken in conjunction with the accompanying drawings and this background of the invention.
An apparatus is provided for congestion management in gateways operating in a wireless, wireline or a combination wireless and wireline communication system. The apparatus can support legacy assumption based congestion systems or speculation based congestion systems (denoted SpecTCP) or the alternative combination thereof (for legacy backward compatibility). The congestion management system optimally resizes, or not, congestion window (or buffer) sizing for the communication gateways. Application of the inventive congestion management system optimizes data recovery and throughput in communication networks, particularly those networks having lossy data links.
A method is provided for congestion management in gateways operating in a wireless, wireline or a combination wireless and wireline communication system. The method can support legacy assumption based congestion systems or speculation based congestion systems (denoted SpecTCP) or the alternative combination thereof (for legacy backward compatibility). The congestion management method optimally resizes, or not, congestion window (or buffer) sizing for the communication gateways. Application of the inventive congestion management method optimizes data recovery and throughput in communication networks, particularly those networks having lossy data links.
The present invention will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and
The following detailed description of the invention is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background of the invention or the following detailed description of the invention.
To overcome the detriments of legacy congestion control systems when applied to lossy (e.g., wireless) links characterized by high error rates, the present invention has modeled congestion window size for gateways to optimally size the congestion window to improve data throughput. The congestion management algorithm and system of the preferred embodiment is particularly useful when applied to speculative techniques (i.e., speculating on the outcome of branch predictions) for throughput improvements when lossy links are involved in TCP/IP connections. Thus, the present invention helps to eliminate the waste of bandwidth in responding to link errors.
Referring to
In accordance with the preferred embodiment of the present invention, enhancements to the network layer operate to optimally control the congestion window size for gateways at the physical layer (not shown). In one preferred embodiment of the present invention, the middleware layer 16 and the network layer 12 provide control functions and supporting parameters to the transport layer 14. At the transport layer 14, a conditional Bernoulli loss predictor 18 provides inputs to a congestion control and loss recovery module 20. Congestion control and loss recovery module 20 utilizes a speculation based algorithm according to a preferred embodiment of the present invention. Together, the conditional Bernoulli loss predictor 18 and the congestion control and loss recovery module 20 combine to provide SpecTCP congestion control. Alternatively, a preferred embodiment of the present invention could operate in a wireline communication system or a combination wireless/wireline communication system.
At the network layer 12, a condition engine 22 provides required information for the conditional Bernoulli loss predictor 18. The condition engine 22 includes mathematical models of explicit congestion notification (ECN) capable random early drop (RED) gateways that produce networking parameters to minimize congestion losses at RED gateways, thus maximizing the accuracy of the conditional Bernoulli loss predictor 18 at the transport layer. To most effectively adjust RED gateway parameters so that congestion losses can be minimized, the present invention utilizes mathematical models to dimension the buffer size inside a RED gateway. As dimensioned by the present invention, the RED gateway buffer sizes are much smaller than previously suggested values, thus providing a significant contribution to the network performance improvement.
Assuming that both assumption based and speculation based congestion control are used, the middleware layer 16 has a congestion control manager 24 to select and control the execution of congestion control schemes at the transport layer 14. Based on the global knowledge of a network, the congestion control manager 24 makes and executes the decision whether the communication system should run an assumption based congestion control algorithm of the legacy assumption based congestion controller 26 (i.e., TCP/IP) or SpecTCP (the speculation based congestion control algorithm of the conditional Bernoulli loss predictor 18 and congestion control and loss recovery module 20). Moreover, the congestion control manager 24 at the middleware layer 16 functions as a bridge to ensure that end users with the communication system of the present invention can communicate with users of legacy networks. This is achieved by the ability to switch between the legacy assumption based congestion controller 26 and SpecTCP based upon the global ECN compatibility information obtained through the congestion control manager 24. Alternately, if only one type of congestion control system is used, the present invention within the network layer 12 would communicate with the congestion control system employed.
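The switching decision made by the congestion control manager might be sketched as below. The text states only that the choice is based on global ECN compatibility information; the specific rule used here (run SpecTCP only when every gateway on the path is ECN-capable) and the names `Scheme` and `select_scheme` are assumptions made for illustration.

```python
from enum import Enum
from typing import Sequence


class Scheme(Enum):
    LEGACY = "legacy assumption based"
    SPECTCP = "SpecTCP (speculation based)"


def select_scheme(gateway_ecn_capable: Sequence[bool]) -> Scheme:
    """Choose a congestion control scheme from per-gateway ECN capability.

    Assumed rule: SpecTCP relies on ECN to signal congestion, so it is
    selected only if all known gateways on the path support ECN.
    An empty path (no knowledge) falls back to the legacy scheme.
    """
    if gateway_ecn_capable and all(gateway_ecn_capable):
        return Scheme.SPECTCP
    return Scheme.LEGACY


all_ecn = select_scheme([True, True, True])
mixed = select_scheme([True, False, True])
```

Falling back to the legacy controller whenever ECN support is incomplete is one conservative way to preserve the backward compatibility described above.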
As illustrated in
To develop the congestion loss minimization models at RED gateways, the inventors derived expressions for the maximum buffer size and the maximum threshold of a RED gateway to minimize congestion packet losses. The minimization of congestion losses significantly improves the accuracy of speculating that loss events are due to link corruptions. This indicates that the condition engine within conditional Bernoulli predictor is optimized. Therefore, it is reasonable to set P=1 (i.e., to predict all incoming loss events are caused by link errors). As is known, ECN-capable RED gateways use an exponential weighted moving average to calculate an average queue size from the instantaneous queue size, and two thresholds (minimum and maximum), to determine whether an arriving packet should be dropped. If the average queue size is greater than the maximum threshold, the packet is dropped. If the average queue size is between the minimum and the maximum thresholds, the packet is marked with a probability as a Congestion Experienced (CE) packet. Packet losses due to the average queue size exceeding the maximum threshold at a RED gateway degrade TCP performance.
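The RED behavior summarized above can be sketched as follows. The text says only that packets between the two thresholds are marked "with a probability"; the linear ramp up to a ceiling `max_p` used here is the conventional RED form and is an assumption with respect to this document, as are the function names and default values. The sketch also assumes `min_th < max_th`.

```python
import random
from typing import Callable


def update_average(avg: float, inst: float, w: float) -> float:
    """Exponential weighted moving average of the queue size."""
    return (1.0 - w) * avg + w * inst


def red_action(avg: float, min_th: float, max_th: float,
               max_p: float = 0.1,
               rng: Callable[[], float] = random.random) -> str:
    """Decide what an ECN-capable RED gateway does with an arriving packet."""
    if avg > max_th:
        return "drop"            # average queue above the maximum threshold
    if avg >= min_th:
        # Assumed linear marking probability between the two thresholds.
        p = max_p * (avg - min_th) / (max_th - min_th)
        return "mark" if rng() < p else "forward"   # "mark" = set CE
    return "forward"             # below the minimum threshold


new_avg = update_average(10.0, 20.0, 0.002)
```

Because the weight `w` is small, a large jump in the instantaneous queue size moves the average only slightly, which is the low pass filter behavior the derivation below relies on.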
Those skilled in the art will be able to consider a typical model consisting of two RED gateways fed by multiple sources. As is known, the link connecting two RED gateways is the bottleneck link which causes congestion. The sources, destinations and the RED gateways use ECN for end-to-end congestion control.
The following notations will be used in the discussion of the inventive model in accordance with the present invention:
Q(t), Q(t)max: Instantaneous and maximum instantaneous queue sizes respectively at the RED gateway at time t.
Q, Qmax: Average and maximum average queue sizes respectively at the RED gateway.
w: Weighting factor for calculating Q.
(t): Marking probability at the RED gateway at time t.
minth, maxth: Minimum and maximum thresholds respectively of a RED gateway.
m: total number of TCP flows.
Wi(t): Window size of the ith TCP flow at time t, t ≥ 0, i = 1, ..., m.
SSthreshi: Slow start threshold for the ith TCP flow, i = 1, ..., m.
ri: Round trip time (RTT) for the ith TCP flow, i = 1, ..., m. ri is replaced by r when all the RTTs are the same.
μi: Average share of the bottleneck link bandwidth of the ith TCP flow, i = 1, ..., m.
μ: Bandwidth of the bottleneck link, which is given by $\mu = \sum_{i=1}^{m} \mu_i$.
T[1]: Waiting time for the first marking event after the average queue size exceeds minth.
βi: Number of window size increases during time T[1] for the ith TCP flow, i = 1, ..., m.
τi: Propagation delay from source i to the RED gateway, i = 1, ..., m.
t0: Time when the first packet is marked at the RED gateway.
t1: Time when the last packet, which was sent just before the first window size reduction, arrives at the RED gateway.
Packet drops at an ECN-capable RED gateway are due either to buffer overflows (Q(t) is equal to the buffer size) or to the average queue size Q exceeding maxth. The congestion window (or buffer) size during the slow start phase increases very quickly. The average queue size (being the output of a low pass filter) of a RED gateway cannot follow the quick change of Q(t); as a result, the average queue size Q stays less than minth. Therefore, Q(t) reaches the maximum value when the packet leaving the source at t−τi reaches the RED buffer. When this packet left the source, Wi(t−τi) = SSthreshi for i = 1, 2, ..., m; the queue size is smaller when the sources are in congestion avoidance. For m TCP flows, Q(t)max can be expressed as the output of a system with processing capacity of $\sum_{i=1}^{m} r_i \mu_i$ and the maximum input rate when sources reach their slow start threshold.
Thus:
According to one embodiment of the present invention, this is the buffer size used to minimize packet loss at the RED gateway.
Turning now to the derivation of the maximum average queue size, it is known that the recommended maxth = 3 × minth. When the average queue size is in the steady-state condition (during which the sources are in the congestion avoidance phase), the instantaneous queue size at time t0 is: $Q(t_0) = min_{th} + \sum_{i=1}^{m} \beta_i$. Since the difference between t0 and t1 is one RTT, and the window size of a source is increased by one per RTT during the congestion avoidance phase, the instantaneous queue size at time t1 can be expressed as: $Q(t_1) = min_{th} + \sum_{i=1}^{m} (\beta_i + 1)$. The average queue size is estimated using an exponential weighted moving average as shown in Eq 1 above. If time is discretized into time slots with each slot being equal to one RTT, the RED's average queue size estimation algorithm at the k-th slot can be expressed as: $\bar{Q}[k+1] = (1-w)\,\bar{Q}[k] + w\,Q[k]$, where $\bar{Q}[k]$ denotes the average queue size and $Q[k]$ the instantaneous queue size in slot k. In practice, w is very small, and the congestion window (or buffer) size increases by one every RTT during the congestion avoidance phase. Therefore, before the first marking event happens (i.e., no congestion control) it is reasonable to consider both the instantaneous queue size and the average queue size to be constant within a very short time period. Thus, by using Q(t1) from the derivation above (slot k is equal to t1 in time) and assuming that the average queue sizes during the two previous consecutive time slots are the same, the average queue size estimated at time t1 can be solved iteratively, which gives: $Q_{max} = \bar{Q} = min_{th} + \sum_{i=1}^{m} (\beta_i + w)$ (Eq 2). The first marking event is followed by many random ECN marking events, which make TCP sources adjust their congestion window sizes. The average queue size stays at a certain level smaller than the average queue size at time t1. Therefore, Eq 2 gives the maximum average queue size for minimizing packet losses and represents the value of maxth according to the present invention.
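The step from the moving average update to Eq 2 can be written out explicitly. The following is a sketch under one reading of the assumptions stated above: the average queue size entering slot k is taken equal to the instantaneous queue size at t0, and the instantaneous queue size in slot k is Q(t1). The overbar marks the average queue size.

```latex
\begin{align*}
\bar{Q}[k+1] &= (1-w)\,\bar{Q}[k] + w\,Q[k] \\
             &= (1-w)\Bigl(min_{th} + \sum_{i=1}^{m}\beta_i\Bigr)
                + w\Bigl(min_{th} + \sum_{i=1}^{m}(\beta_i + 1)\Bigr) \\
             &= min_{th} + \sum_{i=1}^{m}\beta_i + w\,m \\
             &= min_{th} + \sum_{i=1}^{m}(\beta_i + w)
\end{align*}
```

The last line matches Eq 2: each of the m flows contributes its βi window increases plus a small term w from the single additional increase weighted by the averaging filter.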
With the buffer size and maximum average queue size determined, consider again
In the algorithm 30 in SpecTCP, the congestion window size is appropriately controlled in the presence of either network congestion or corruption. In the preferred embodiment of the present invention, the congestion window is halved with the fast recovery algorithm when there is network congestion (as evidenced by ECN ECHO packets 36), and the congestion window size persists (remains substantially unchanged) at the previous value in the presence of corruption. Alternately, the congestion window (or buffer) size may be adjusted in the range of 40 percent to 80 percent; however, a reduction of about 50 percent is one preferred embodiment. As will be appreciated by those skilled in the art, the ECN mechanism will be most effective if it is used with active queue management such as that found in contemporary RED gateways. In active queue management, when a buffer reaches a certain threshold, the RED gateway will send a CE packet to the TCP receiver before the buffer overflows. Therefore, packet drops due to congestion happen only after the RED gateway has sent CE packets. By optimally dimensioning the congestion window (or buffer) size per the present invention, and using the maximum threshold for the active queue management for ECN capable RED gateways, the accuracy of the conditional Bernoulli loss predictor 18 can be optimized.
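The window response described above can be sketched as follows. This is an illustration only: the names `LossCause` and `next_cwnd` are hypothetical, the 40 to 80 percent range is interpreted here as the fraction by which the window is reduced, and the floor of one segment is an added assumption.

```python
from enum import Enum


class LossCause(Enum):
    CONGESTION = "congestion"   # evidenced by an ECN echo
    CORRUPTION = "corruption"   # speculated link error


def classify_loss(ecn_echo_seen: bool) -> LossCause:
    """With P = 1, every loss without an ECN echo is treated as corruption."""
    return LossCause.CONGESTION if ecn_echo_seen else LossCause.CORRUPTION


def next_cwnd(cwnd: float, cause: LossCause,
              reduction: float = 0.5, min_cwnd: float = 1.0) -> float:
    """Return the congestion window after a loss event."""
    if not 0.4 <= reduction <= 0.8:
        raise ValueError("reduction is assumed to lie between 0.4 and 0.8")
    if cause is LossCause.CONGESTION:
        return max(min_cwnd, cwnd * (1.0 - reduction))  # back off
    return cwnd  # corruption: window persists at its previous value


after_congestion = next_cwnd(20.0, classify_loss(ecn_echo_seen=True))
after_corruption = next_cwnd(20.0, classify_loss(ecn_echo_seen=False))
```

With the default reduction of 0.5 the congestion branch reproduces the halving of the preferred embodiment, while the corruption branch leaves throughput untouched.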
In
while the maximum threshold size 60 is set according to: $Q_{max} = \bar{Q} = min_{th} + \sum_{i=1}^{m} (\beta_i + w)$ will minimize congestion as well as result in better prediction should the present invention be utilized in prediction based congestion management systems.
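As a numeric illustration of the maximum threshold expression, the short function below evaluates it for given inputs. The function name and the sample values are hypothetical and chosen only to show the arithmetic.

```python
from typing import Sequence


def max_threshold(min_th: float, betas: Sequence[float], w: float) -> float:
    """Evaluate min_th + sum over flows of (beta_i + w)."""
    return min_th + sum(beta + w for beta in betas)


# Three flows with 3, 4 and 5 window increases, and a small averaging weight.
example = max_threshold(min_th=10.0, betas=[3, 4, 5], w=0.002)
```

Since w is small in practice, the result is dominated by minth and the sum of the βi values; the w term adds only m × w packets.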
Returning again to
As will be appreciated by those skilled in the art, the present invention is effective in improving network throughput over lossy links. The inventive communication system is able to optimally adjust congestion window (or buffer) sizes in gateways. Also, when used in combination with a speculation based congestion system, the message sender does not have to waste time and bandwidth (i.e., congestion window size backoff) waiting for implicit network information about the losses. Therefore, network throughput is effectively enhanced.
Another benefit of the present invention is that it does not starve other competing data flows. Under normal working conditions, no matter which congestion control algorithm is applied, all users are controlled by congestion window evolutions, which were designed to reduce unfairness. Unfairness in data flow is more likely to happen during failure modes of the system. For example, if the communication system incorrectly speculated that a congestion loss was a link corruption loss, and the ECN packets used to indicate congestion were lost, the system would not decrease the sender's congestion window size (although it should), which could result in starvation of other competing legacy TCP data flows. To improve the ECN algorithm reliability in one embodiment of the present invention, ECN packets are transmitted continuously until the sender acknowledges that ECN packets are received.
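The reliability measure in the last sentence can be sketched as a small piece of receiver-side state. The class `EcnEchoState` and its method names are hypothetical; the sketch assumes the congestion indication is carried as a flag on outgoing ACKs and is repeated until the sender confirms it.

```python
class EcnEchoState:
    """Hypothetical receiver state that repeats the ECN indication."""

    def __init__(self) -> None:
        self.pending = False

    def on_ce_packet(self) -> None:
        """A packet marked Congestion Experienced arrived from the gateway."""
        self.pending = True

    def on_sender_confirmation(self) -> None:
        """The sender acknowledged that it received the ECN indication."""
        self.pending = False

    def echo_flag_for_next_ack(self) -> bool:
        """Every outgoing ACK carries the indication while it is pending."""
        return self.pending


state = EcnEchoState()
state.on_ce_packet()
# Three ACKs go out before the sender confirms; all carry the indication.
flags_before = [state.echo_flag_for_next_ack() for _ in range(3)]
state.on_sender_confirmation()
flag_after = state.echo_flag_for_next_ack()
```

Repeating the indication means that the loss of any single ECN packet does not leave the sender unaware of congestion, which limits the starvation scenario described above.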
While at least one exemplary embodiment has been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention. It being understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.