Network communication with path MTU size discovery

Description

FIELD OF THE INVENTION

This invention generally relates to network communications and more particularly to discovery of a maximum transmission unit (MTU) size in a path between two nodes of a network.

BACKGROUND OF THE INVENTION

When one Internet Protocol (IP) host has a large amount of data to send to another host, the data is transmitted as a series of IP datagrams. It is often preferable that these datagrams be of the largest size that does not require fragmentation anywhere along the path from the source to the destination. This datagram size is referred to as the Maximum Transmission Unit (MTU) for the path and is sometimes referred to as the Path MTU or PMTU. The Path MTU is equal to the minimum of the MTUs of each hop in the path.

Fragmenting a packet involves dividing the packet into smaller packets and adding a header to each smaller packet. Since each fragment has the same header overhead as the original message, fragmenting packets adds to the total number of bytes that need to be transmitted in order to transmit the message. This can slow down transmission. It is therefore advantageous to discover Path MTU in order to avoid fragmenting packets.
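By way of illustration only, the byte overhead of fragmentation may be estimated as in the following sketch. The function name bytes_on_wire is hypothetical, and the sketch assumes IPv4 with a 20-byte header carrying no options; it also applies the IPv4 rule that every fragment except the last carries a payload that is a multiple of 8 bytes.

```python
import math

IPV4_HEADER = 20  # bytes; assumes an IPv4 header without options


def bytes_on_wire(payload_len: int, mtu: int) -> int:
    """Total IP bytes needed to carry payload_len bytes of IP payload
    across a link with the given MTU (hypothetical helper)."""
    # Each fragment carries at most (mtu - header) payload bytes, rounded
    # down to a multiple of 8 because fragment offsets count 8-byte units.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    fragments = math.ceil(payload_len / per_fragment)
    # Every fragment repeats the IP header.
    return payload_len + fragments * IPV4_HEADER


unfragmented = bytes_on_wire(1480, 1500)  # fits in one 1500-byte datagram
fragmented = bytes_on_wire(1480, 1000)    # split into two fragments
print(unfragmented, fragmented)
```

In this sketch the same 1480-byte payload costs one additional header when it crosses a 1000-byte link, and each fragment is also a separate packet for every downstream router to handle.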

A shortcoming of the prior art is the lack of an adequate mechanism for discovering the MTU of an arbitrary path between two hosts. Prior art techniques for Path MTU discovery are described e.g., in RFC 1191, “Path MTU Discovery”, by J. Mogul and S. Deering, which is available on the Internet at http://www.ietf.org/rfc/rfc1191.txt?number=1191, the contents of which are incorporated herein by reference. RFC 1191 describes a technique for Path MTU discovery by setting the “do not fragment” (DF) flag on packets sent by the host. If a router in the path has an MTU size smaller than the packet size, an Internet Control Message Protocol (ICMP) error is returned and the packet is dropped. Otherwise, the packet is received by the intended recipient, which verifies receipt of the packet. Unfortunately, administrative privilege is often required in order to be able to set the DF flag. In addition, not all routers are configured to provide the ICMP messages that are relied upon in this technique. In fact, most routers are not so configured.

Additional prior art path MTU discovery techniques are described by M. Mathis and J. Heffner in an internet draft titled “Packetization Layer Path MTU Discovery”, a copy of which is available on the internet at: <http://www.ietf.org:80/rfc/rfc4821.txt?number=4821>, the contents of which are incorporated herein by reference.

This RFC addresses issues with classic Path MTU discovery, which include “ICMP black holes” and ICMP blockage by firewalls. However, the Packetization Layer Path MTU Discovery (PLPMTUD) technique still has a number of drawbacks. For example, PLPMTUD techniques must be able to set the do not fragment (DF) bit to 1 for packet loss detection. Unfortunately, the DF bit typically cannot be controlled from applications. In addition, PLPMTUD needs to be supported by both the IP layer and the TCP/IP layer to work, and is not yet widely implemented.

It is within this context that embodiments of the present invention arise.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention may be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram illustrating a network path between two hosts.

FIG. 2 is a block diagram illustrating the protocol stacks in the hosts and routers of FIG. 1.

FIG. 3A is a graph illustrating the data transmission rate of a packet rate limited router as a function of packet size.

FIG. 3B is a graph illustrating the data transmission rate of a bit rate limited router as a function of packet size.

FIG. 4 is a flow diagram of a method for Path MTU discovery according to an embodiment of the present invention.

FIG. 5A is a graph illustrating an example of a false detection of a packet rate limited router.

FIG. 5B is a graph illustrating an example of a false detection of a bit rate limited router.

FIG. 6 is a schematic diagram of an apparatus for path MTU discovery according to an embodiment of the present invention.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Although the following detailed description contains many specific details for the purposes of illustration, anyone of ordinary skill in the art will appreciate that many variations and alterations to the following details are within the scope of the invention. Accordingly, the exemplary embodiments of the invention described below are set forth without any loss of generality to, and without imposing limitations upon, the claimed invention.

Embodiments of the present invention are directed to methods and apparatus for discovery of a maximum transmission unit (MTU) size in a path between a first host and a second host connected by a network. A plurality of test packets of varying transmission unit (TU) size may be sent from the first host to the second host. A “do not fragment” (DF) flag for the test packets is not set. It is determined whether one or more of the test packets were received by the second host. An estimated path MTU size may then be calculated based on one or more patterns of receipt of the test packets by the second host.

TECHNICAL BACKGROUND

Embodiments of the present invention may be understood in the context of network communications. FIG. 1 illustrates an example of network communication between Host 1 102 and Host 2 104. By way of example, the hosts may be any network-capable device. Such devices include, but are not limited to, computers, hand held internet browsers and/or email devices, Voice over Internet Protocol (VoIP) phones, video game consoles, hand held video game devices, and the like. Messages from Host 1 travel to Host 2 over a network path 103 via routers 106, 108, and 110. Each router may have a different Maximum Transmission Unit (MTU). In this example, router 106 has an MTU of 1500 bytes, router 108 has an MTU of 1000 bytes and router 110 has an MTU of 1500 bytes. The path MTU for the path 103 is the smallest MTU of any router in the path, which is 1000 bytes in this example.
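By way of illustration only, the path MTU for the example of FIG. 1 may be expressed as a minimum over the per-hop MTUs. The dictionary below simply restates the example values given above; the variable names are hypothetical.

```python
# Per-hop MTUs from the example of FIG. 1, in bytes.
hop_mtus = {"router 106": 1500, "router 108": 1000, "router 110": 1500}

# The path MTU is the smallest MTU of any hop in the path.
path_mtu = min(hop_mtus.values())

# The hop that imposes the limit (the first one, if several tie).
bottleneck = min(hop_mtus, key=hop_mtus.get)

print(path_mtu, bottleneck)
```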

The Hosts 102, 104 and routers 106, 108, 110 may be configured to communicate with each other according to a network protocol. FIG. 2 illustrates an example of a network protocol configuration for the situation shown in FIG. 1. By way of example, each host device 102, 104 may be configured (either in software or hardware or some combination of both) with a network protocol stack having five layers: an Application layer APP, a Transport layer TRANS, a Network layer NET (sometimes referred to as the IP layer), a Data Link Layer DLL and a Physical layer PHYS. These layers are well-known to those of skill in the art.

The Hosts 102, 104 typically implement all five layers. The routers 106, 108, 110 typically implement only the Network, Data Link and Physical layers.

By way of example, embodiments of the present invention may implement Path MTU discovery at the Application layer. Typically, the Transport layer and below are implemented in an operating system (OS) kernel and applications have no control in changing behavior at these layers. Classic PMTUD, by contrast, is typically implemented at the Transport and IP (Network) layers.

The Application layer APP represents the level at which applications access network services. This layer represents the services that directly support applications such as software for file transfers, database access, and electronic mail. Examples of application layer software include HL7, Modbus, Session Initiation Protocol (SIP), and Simple Sensor Interface Protocol (SSI). In the particular case of the TCP/IP suite, the Application layer APP may be implemented with software protocols such as Hypertext Transfer Protocol (HTTP), Session Initiation Protocol (SIP), Simple Mail Transfer Protocol (SMTP), Short Message Peer-to-Peer Protocol (SMPP), Simple Network Management Protocol (SNMP), File Transfer Protocol (FTP), Teletype Network (TELNET), Network File System (NFS), Network Time Protocol (NTP), Real-time Transport Protocol (RTP), Dynamic Host Configuration Protocol (DHCP), and Domain Name System (DNS). The Application layer APP may sometimes be divided further into a Presentation layer and a Session layer, e.g., in the Open Systems Interconnection (OSI) model. The Presentation layer translates data from the Application layer into an intermediary format. The Presentation layer may also manage security issues by providing services such as data encryption, and may compress data so that fewer bits need to be transferred on the network. The Session layer allows two applications on different computers to establish, use, and end a session. The Session layer may establish dialog control between the two computers in a session, regulating which side transmits, plus when and how long it transmits.

The Transport layer TRANS handles error recognition and recovery. For a transmitting host, the Transport layer may also repackage long messages when necessary into small packets for transmission. For a receiving host the Transport layer rebuilds packets into the original message. The Transport layer for a receiving host may also send receipt acknowledgments. Examples of particular Transport layer protocols include Transmission Control Protocol (TCP), User Datagram Protocol (UDP) and Stream Control Transmission Protocol (SCTP), all of which, and equivalents thereof, are well-known to those of skill in the art. The Transport layer TRANS is the layer that typically supports packet fragmentation. It is noted that fragmentation may take place in the Transport layer of the host originating a message or at the Transport layer of any of the routers along the path between that host and the message's intended recipient.

The Network layer NET addresses messages and translates logical addresses and names into physical addresses. It also determines the route from the source to the destination computer. The Network layer may also manage traffic problems, such as switching, routing, and controlling the congestion of data packets. Examples of particular Network layer protocols include, but are not limited to, Internet Protocol (IP), Internet Control Message Protocol (ICMP), IP Security (IPsec), Address Resolution Protocol (ARP), Routing Information Protocol (RIP) and Open Shortest Path First (OSPF), all of which, and equivalents thereof, are well-known to those of skill in the art.

The Data Link layer DLL packages raw bits from the Physical layer PHYS into frames (logical, structured packets for data). The Data Link layer may also be responsible for transferring frames from one computer to another, without errors. After sending a frame, the Data Link layer DLL waits for an acknowledgment from the receiving computer. Examples of particular Data Link layer protocols include, but are not limited to, Point-to-Point Protocol (PPP), Serial Line Internet Protocol (SLIP) and Media Access Control (MAC) all of which, and equivalents thereof, are well-known to those of skill in the art. The Data Link layer DLL typically limits the MTU size.

The Physical layer PHYS transmits bits from one computer to another and regulates the transmission of a stream of bits over a physical medium. This layer defines how the cable is attached to the network adapter and what transmission technique is used to send data over the cable. Examples of particular Physical layer protocols and standards include, but are not limited to, RS-232, V.35, V.34, I.430, I.431, T1, E1, 10BASE-T, 100BASE-TX, POTS, SONET, DSL, 802.11a, 802.11b, 802.11g, 802.11n all of which, and equivalents thereof, are well-known to those of skill in the art.

A message originating at Host 1 102 starts at the Application layer APP and works its way down the protocol stack to the Physical layer PHYS. When the message arrives at Host 2 104, it is received at the Physical layer PHYS and works its way up the stack to the Application layer APP. In the path 103 between the two hosts 102, 104, the message is received at the Physical layer PHYS of router 106 and works its way up to the Transport layer TRANS and then back down the stack to the Physical layer PHYS for transmission to router 108. The process repeats for routers 108 and 110. In peer-to-peer situations, once a connection has been established between the hosts 102, 104 they may communicate directly by peer-to-peer connections 105, e.g., at the Application layer APP or at the Transport layer TRANS.

Path MTU Discovery Method and Apparatus

By way of example, embodiments of the invention may be applied to discovery of MTU size defined at the IP (Network) layer. Alternatively, MTU size discovery as described herein may be equally applied to any supported transport protocol.

According to embodiments of the present invention, Path MTU discovery may be based on two observations. The first observation is that most routers will properly fragment packets that conform to certain Transport Layer protocols. An example of such a protocol is the User Datagram Protocol (UDP). UDP is a minimal message-oriented transport layer protocol that is described, e.g., by J. Postel in IETF RFC 768, Aug. 28, 1980, which may be accessed on the Internet at http://tools.ietf.org/html/rfc768, the contents of which are incorporated herein by reference. In the Internet protocol (IP) suite, UDP may provide a very simple interface between a network layer below (e.g., IPv4) and a session layer or application layer above. UDP is often described as being a connectionless protocol. As used herein, connectionless refers to network protocols in which a host can send a message without establishing a connection with the recipient. That is, the host simply puts the message onto the network with the destination address and hopes that it arrives. Other examples of connectionless protocols include Ethernet and IPX. UDP is typically used for message broadcast (sending a message to all on a local network) or multicast (sending a message to all subscribers). Common network applications that use UDP include the Domain Name System (DNS), streaming media applications such as Internet Protocol Television (IPTV), Voice over IP (VoIP), Trivial File Transfer Protocol (TFTP) and online games.

The second observation is that routers tend to exhibit one of two particular types of bandwidth limitation behavior. Specifically, router bandwidth limitation may be classified as being either packet rate limited or bit rate limited. In a packet rate limited router, the data transmission rate is determined by the number of packets the router can transmit per unit time. For a packet rate limited router, the size of the packets does not affect the number of packets the router can send per unit time as long as the packets are no larger than some maximum packet size, which determines the MTU for that router. Packet rate limited routers are sometimes referred to herein as being packet-per-second (pps) limited. For a pps-limited router, it makes sense to send packets that are as large as possible in order to optimize the data transmission rate. For a bit rate limited router, by contrast, the data transmission rate is determined by a maximum number of bits per unit time that is independent of the packet size. Bit rate limited routers are sometimes referred to herein as being bit-per-second (bps) limited. It is noted that both bps-limited routers and pps-limited routers may fragment a packet depending on the MTU set for the router.

The difference in behavior of the packet rate limited and bit rate limited routers is illustrated in FIGS. 3A-3B. Specifically, FIG. 3A graphically depicts the data transfer rate for UDP packets as a function of transmission unit TU size for a packet rate limited router. Packets at an initial size are sent at an initial bandwidth BW₀ (e.g., 64 Kilobits per second (Kbps)). Preferably the sending host has the ability to “throttle” the bandwidth with which the packets are sent. Such a “slow-start” approach is often useful since packets are queued at each node. A long queue increases latency, which is undesirable. Long queues also tend to take a long time to recover. Embodiments of the present invention avoid this by adjusting the sending bandwidth BW while keeping the TU size fixed. Each packet includes a request for the receiving host to provide the data transfer rate (e.g., in bits per second (bps)) for the received packets. As the bandwidth is increased, the data transfer rate for the received packets will continue to increase until the bandwidth reaches a packet-limit saturation. At this point, increasing the bandwidth does not further increase the data transfer rate for the packets since the router transmits a fixed number of packets per second. For packets that are smaller than the router's MTU, the packet-limit saturated data transfer rate increases approximately linearly as the packet size increases, as indicated by the line 302. For example, if the path contains a router having a packet limit of 32 packets per second and an initial packet size of, e.g., 480 8-bit bytes, the data transfer rate for the packets will saturate at about 120 Kbps. If the packet size is increased by 50%, e.g., to 720 bytes, but remains below the MTU size for the router, the bandwidth will saturate at about 180 Kbps. Such linear behavior is characteristic of a pps-limited router. Packets that are greater than the MTU size for the router are fragmented into two or more packets. As a result, the number of packets increases but the packet transmission rate does not. Consequently, the data transmission rate abruptly drops just beyond the MTU size. If the packet size is again increased, the data transmission rate for a pps-limited router is expected to increase in an approximately linear fashion until the packet size reaches another integer multiple of the MTU size.
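By way of illustration only, the saturated data transfer rate of a pps-limited router may be modeled as in the following sketch. The function name and the 1000-byte MTU are assumptions chosen for the example, and the model is deliberately simplified: it ignores the header bytes added to each fragment.

```python
import math


def pps_limited_rate_bps(tu_bytes: int, mtu_bytes: int, packets_per_second: int) -> float:
    """Saturated data transfer rate (bits per second) through a
    packet-rate-limited router, under a simplified model."""
    # A TU larger than the MTU is split into this many fragments.
    fragments = math.ceil(tu_bytes / mtu_bytes)
    # The router forwards a fixed number of packets per second, so only
    # 1/fragments of that budget is available for whole transmission units.
    return packets_per_second * tu_bytes * 8 / fragments


# 32 packets per second as in the example above; a 1000-byte MTU is assumed.
for tu in (480, 720, 1200):
    print(tu, pps_limited_rate_bps(tu, 1000, 32))
```

Under these assumptions the rate is 122,880 bps at 480 bytes and 184,320 bps at 720 bytes (about 120 Kbps and 180 Kbps, taking 1 Kbps as 1024 bps), and it falls to 153,600 bps at 1200 bytes because each transmission unit then occupies two of the router's packet slots.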

Rate limitation, which may occur either intentionally or unintentionally, could happen at any layer in the protocol stack. One “intentional” case that is very common is to set up IP tables (set policies within the IP and transport layers) to throttle bandwidth. Bandwidth saturation may be detected at the receiver side by observing packet loss and increase of latency. As described above, there are a series of queues in the path. When saturation occurs somewhere in the path, a queue right before the saturation point starts accumulating packets. This may be observed as an “increase of latency” at the receiver by checking timestamps added to each packet. Eventually, the queue becomes full and packets start being dropped, which may also be observed at the receiver side by checking sequence numbers attached to each packet.
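By way of illustration only, a receiver-side saturation check based on sequence numbers and timestamps might be sketched as follows. The record layout, the function name and both thresholds are hypothetical, and the sketch assumes that duplicate packets have already been discarded. Because only the change in delay across a batch is examined, a constant clock offset between the two hosts cancels out.

```python
from dataclasses import dataclass


@dataclass
class Received:
    seq: int        # sequence number stamped by the sender
    sent_at: float  # sender timestamp, in seconds
    recv_at: float  # receiver timestamp, in seconds


def is_saturated(packets, loss_threshold=0.05, delay_growth_threshold=0.050):
    """Return True if a batch of received test packets shows significant
    packet loss or a growing one-way delay (thresholds are assumptions)."""
    if len(packets) < 2:
        return False
    ordered = sorted(packets, key=lambda p: p.seq)
    # Gaps in the sequence numbers indicate packets dropped along the path.
    expected = ordered[-1].seq - ordered[0].seq + 1
    loss = 1.0 - len(ordered) / expected
    # A queue filling up ahead of the saturation point shows up as delay
    # that grows from the first packet of the batch to the last.
    delays = [p.recv_at - p.sent_at for p in ordered]
    delay_growth = delays[-1] - delays[0]
    return loss > loss_threshold or delay_growth > delay_growth_threshold
```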

FIG. 3B graphically depicts the data transfer rate for UDP packets as a function of packet size for a bit rate limited router. It is noted that a bit rate limited router is generally not sensitive to fragmentation of the packets because it is not affected by the number of packets sent per second. For example, sending one 1000-byte packet per second or two 500-byte packets per second is the same for a bit rate limited router. However, although the bandwidth may be more or less fixed for such a router, the data transfer rate (e.g., in bits per second) may vary due to a more or less constant latency associated with transmission of each packet. As a result of the latency, the data transfer rate for a bps-limited router will tend to increase with packet size. However, as the data transmission rate approaches a bandwidth limit BW_L, the transmission rate will tend to flatten off as a function of packet size.
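By way of illustration only, the flattening behavior of a bps-limited router may be reproduced with a simple model in which each packet costs a fixed latency plus its serialization time at the bandwidth limit. This particular formula, the function name and the numerical values are assumptions made for the sketch and are not taken from the figures.

```python
def bps_limited_rate_bps(tu_bytes: int, bandwidth_limit_bps: float,
                         per_packet_latency_s: float) -> float:
    """Data transfer rate for a bit-rate-limited router under an assumed
    model: fixed per-packet latency plus serialization time."""
    bits = tu_bytes * 8
    # Time per packet = constant latency + time to push the bits at BW_L.
    return bits / (per_packet_latency_s + bits / bandwidth_limit_bps)


# Assumed values: BW_L = 1 Mbps, 2 ms of per-packet latency.
for tu in (480, 960, 1440):
    print(tu, round(bps_limited_rate_bps(tu, 1_000_000, 0.002)))
```

In this model the rate rises with packet size but each increase is smaller than the last, and the rate never exceeds the bandwidth limit.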

Thus, based on an understanding of the two types of router behavior illustrated in FIGS. 3A-3B, path MTU discovery may proceed according to a method 200 as shown in FIG. 4. As indicated at 202, test packets may be transmitted from one host to a recipient (e.g., from host 102 to host 104) with a small initial TU size and a small initial transmission bandwidth BW (see FIG. 1). The DF flag for these packets is not set so that routers along the path 103 may fragment the packets normally, if they are configured to do so. As the packets are received, the transmitting host determines the data transfer rate for the packets, as indicated at 204. By way of example, each packet may include a request for the receiving host to send back a message that indicates the data transfer rate R for the test packet. The sending host probes for saturation behavior at 206. If saturation is not observed, the transmission bandwidth BW is gradually increased with the same TU size at 208, while probing packet loss and growth of delay at the receiver side. When significant packet loss or growth of delay is detected, it may be assumed that the bandwidth for that TU size is saturated. The values of TU and R may be recorded at this point as indicated at 210. The TU size may then be increased, e.g., by 50% of the initial TU size. If the bandwidth is pps limited, it is expected that the bandwidth will grow linearly with TU size until the path MTU size (or an integer multiple thereof) is reached. If the TU size exceeds the actual path MTU size and the bandwidth is pps-limited, the receiver will detect that the data transfer rate is less than for the previous TU size. The example in FIG. 5A shows how the data transfer rate may behave when the TU size exceeds the actual path MTU size. When the TU size exceeds the path MTU size, an intermediary node that has the MTU size set will start fragmenting long packets to fit them into the MTU size. This causes an increase in the number of packets, and a consequent decrease in the transfer rate since the bandwidth is pps limited. Specifically, just above the MTU size, the packets are split into two, which results in a drop in data transfer rate by one half. Just above twice the MTU size the packets are split into three, which results in a drop in data transfer rate by one third. Just above three times the MTU size the packets are split into four, which results in a drop in data transfer rate by one fourth. By detecting this bandwidth drop, network applications can detect the path MTU size to maximize available bandwidth.

If the bandwidth is bps limited, by contrast, the bandwidth will tend to grow until it reaches a bandwidth saturation level, e.g., as shown in FIG. 3B. The data transfer rate for a bps-limited router tends to flatten off without the characteristic drops seen in FIG. 5A. Thus, it is possible to determine router behavior and path MTU size by observing the dependence of data transfer rate R on TU size. By way of example, after each R and TU value have been recorded at 210 the sending host may check at 212 to determine if the latest value of R is less than the previous value of R. If so, the path MTU may be determined from the behavior of R versus TU at 214 based on a packet-rate limit assumption. If saturation of R as a function of TU (e.g., as shown in FIG. 3B) is detected at 216, the path MTU may be calculated based on a bit-rate limit assumption at 218. If such saturation behavior is not detected, the TU may be increased at 220 and the process may repeat at 202, 204, 206, etc. Once the Path MTU size has been determined, message packets of a size less than or equal to the Path MTU size may be sent over the path 103 to the second host 104, as indicated at 222. It is noted that the host that performs the path MTU discovery need not be the one that sends the message packets. For example, if two or more hosts are connected to the second host 104 by the same path 103 it is possible for one of these hosts to perform path MTU discovery and notify another of these hosts of the path MTU. Any of these hosts may then send message packets that are less than or equal to the path MTU size over the path 103.
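By way of illustration only, the checks at 212 and 216 may be sketched as a pass over the recorded (TU, R) pairs. The function name, the return values and the 5% tolerance used to separate a genuine drop or plateau from measurement noise are assumptions. The sketch reports only the classification and, for a packet-rate limited path, the pair of TU sizes that bracket the estimated path MTU; how the path MTU is calculated under the bit-rate limit assumption at 218 is not modeled.

```python
def classify_path(samples, tolerance=0.05):
    """Classify a path from (tu_bytes, rate_bps) pairs recorded at
    saturation, listed in order of increasing TU size.

    Returns (kind, bounds), where bounds is (last TU before the drop,
    first TU after the drop) for a packet-rate limited path, else None.
    """
    for (prev_tu, prev_rate), (tu, rate) in zip(samples, samples[1:]):
        # Check at 212: the latest R is lower than the previous R.
        if rate < prev_rate * (1 - tolerance):
            return "packet-rate limited", (prev_tu, tu)
        # Check at 216: R has stopped growing with TU size.
        if rate <= prev_rate * (1 + tolerance):
            return "bit-rate limited", None
    # Neither pattern seen yet: the TU size would be increased again (220).
    return "undetermined", None


print(classify_path([(480, 122880.0), (720, 92160.0)]))
```

With the two samples shown, the drop between 480 and 720 bytes is classified as packet-rate limited behavior, and the estimated path MTU lies at or above 480 bytes and below 720 bytes.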

It is important that the initial TU size and the increase in TU size be chosen carefully in order to detect the drop in data transfer rate. For example, if the TU size increase is 100% (doubled), the receiver side may not detect enough of a bandwidth drop to be confident that the TU size exceeded the actual path MTU size. FIG. 5A illustrates an example of the problem. Suppose the path is packet rate limited with a path MTU (PMTU) of about 576 bytes and the initial TU size is 480 bytes. If the TU size increases by 480 bytes each time, the drops in transfer rate at PMTU, 2×PMTU and 3×PMTU might not be observed. The measured data transfer rate may be the same at TU size 480 as for 960 and 1440. Such behavior (represented by the dashed curve 502) might be misinterpreted as a characteristic of a bit-rate limited path. Thus, too large an increase in TU size can lead to an erroneous determination of bit-rate limited path behavior. If, instead, the TU size increases by 50% of the initial TU size, the pattern of measurements at TU sizes of 480, 720, 960, 1200 and 1440 bytes reveals the bandwidth drop at PMTU, 2×PMTU, 3×PMTU, etc., as represented by the solid curve 504, behavior which is associated with a packet rate limited path.
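By way of illustration only, the effect of the TU size increment may be checked numerically with the following sketch. It uses the 576-byte PMTU and 480-byte initial TU size from the example above; the 32 packets-per-second limit, the function names and the simplified model (fragment headers ignored) are assumptions.

```python
import math


def saturated_rate(tu: int, pmtu: int, pps: int) -> float:
    """Simplified saturated rate (bps) on a packet-rate limited path."""
    return pps * tu * 8 / math.ceil(tu / pmtu)


def sweep(initial_tu: int, step: int, count: int, pmtu: int = 576, pps: int = 32):
    """Rates measured at initial_tu, initial_tu + step, and so on."""
    sizes = [initial_tu + i * step for i in range(count)]
    return [(tu, saturated_rate(tu, pmtu, pps)) for tu in sizes]


coarse = sweep(480, 480, 3)  # TU sizes 480, 960, 1440
fine = sweep(480, 240, 5)    # TU sizes 480, 720, 960, 1200, 1440

print(coarse)
print(fine)
```

In this model the coarse sweep measures the same rate at all three TU sizes, which resembles a bit-rate limited plateau, while the finer sweep exposes the lower rates at 720 and 1200 bytes that indicate fragmentation.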

FIG. 5B illustrates an example of a possible erroneous determination of packet rate limited behavior. It is possible that the bit-rate limit for a bps-limited router may change over time. If, for example, there is a sudden drop in the bit-rate after an initial measurement at a TU size of 480 bytes, the actual bps-limited router behavior, represented by the solid curve 512, may exhibit a sharp drop in data transfer rate followed by a recovery to a lower value. In such a case, it is possible that subsequent measurements taken at 720-byte and 960-byte TU sizes may be misinterpreted as being characteristic of packet rate limited behavior, represented by the dashed lines 514. Such an erroneous determination may be avoided, e.g., by sending out test packets at sufficiently short intervals of time that the bit rate doesn't change dramatically during the interval. In addition, it may be useful to send out packets over a sufficiently broad range of TU sizes that the true behavior may be determined.

From FIGS. 5A-5B it may be seen that “false positives” may be avoided if the initial TU size and TU size increase are not too large. For most, if not all, routers the MTU size is greater than or equal to some minimum value, e.g., 576 bytes. Thus, it is desirable to choose the initial TU size to be slightly less (e.g., about 15-20% less) than the minimum value. It is possible that the MTU size of a router may be set to less than 576. It is noted that there is no clear standardized requirement on the minimum MTU size of IPv4. However, current practice in the Internet is that the minimum MTU is known to be 576, as stated in RFC 1122 section 3.3.3 “Fragmentation”. Furthermore, in RFC 1063 (IP MTU Discovery Options), for example, 576 bytes is also recommended as a default minimum MTU size. For the reasons explained above with respect to FIG. 5A, it is also desirable that the TU size increase be less than or equal to about 50% of the initial TU size. Furthermore, it is also desirable to repetitively test the path between the hosts 102, 104 with test packets of varying size to verify a previous result.

As depicted in FIG. 6, a host device 600 may include a processor 601 and a memory 602. The processor 601 may be a microprocessor or microcontroller chip of a type commonly used in electronic devices, e.g., a PIC microcontroller from Microchip Technology, Inc. of Chandler, Ariz. Alternatively, the processor 601 may be a parallel processor module, such as a Cell Processor. An example of a Cell Processor architecture is described in detail, e.g., in Cell Broadband Engine Architecture, copyright International Business Machines Corporation, Sony Computer Entertainment Incorporated, Toshiba Corporation Aug. 8, 2005, a copy of which may be downloaded at http://cell.scei.co.jp/, the entire contents of which are incorporated herein by reference. The memory 602 may be in the form of an integrated circuit (e.g., RAM, DRAM, ROM, and the like). A computer program 603 may be stored in the memory 602 in the form of processor readable instructions that can be executed on the processor 601. The instructions of the program 603 may include the steps of the method 200 as depicted in FIG. 4 and described herein. The device 600 may optionally include a control module 606. The control module may be mechanically mounted to or physically incorporated into the device 600. Alternatively the control module 606 may be a remote unit that interacts with the rest of the device 600 via a communication link, which may be a cable or wireless link.

The device 600 may also include well-known support functions 610, such as input/output (I/O) elements 611, power supplies (P/S) 612, a clock (CLK) 613 and cache 614. The device 600 may optionally include a mass storage device 615 such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data. The device 600 may also optionally include a display unit 616 and user interface unit 618 to facilitate interaction between the device 600 and a user. The support functions 610, mass storage 615, display 616 and user interface 618 may be coupled to the processor and/or memory by a data bus 620. The display unit 616 may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, graphical symbols or images. The user interface 618 may include a keyboard, mouse, joystick, light pen or other device. As shown in the particular example depicted in FIG. 6, the user interface 618 may be incorporated into the control module 606. The device 600 may also include a network interface 622 to enable the device to communicate with other devices over a network 625, such as the internet. These components may be implemented in hardware, software or firmware or some combination of two or more of these. The host device 600 may send test data packets 624 and send and receive message packets 626 over the network 625. Each test packet may include a request REQ for a remotely located host to reply to the host device 600 if the remote host receives the test packet. The request REQ may also request the remote host to provide a data transfer rate for the test packet with the reply.
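By way of illustration only, a test packet carrying the request REQ might be laid out as in the following sketch. The field layout, the flag value and the function names are hypothetical. The sequence number and send timestamp are included so that a receiver could observe packet loss and growth of delay, and the packet is padded with zero bytes to the desired TU size. In this sketch the TU size is taken to be the application payload length; the UDP and IP headers would be added by the protocol stack.

```python
import struct
import time

# Hypothetical header: 32-bit sequence number, 64-bit float send timestamp,
# 8-bit flags field, all in network byte order (13 bytes in total).
HEADER = struct.Struct("!IdB")
FLAG_REQ = 0x01  # asks the receiver to reply with the data transfer rate


def build_test_packet(seq: int, tu_size: int, now=None) -> bytes:
    """Build a test packet of exactly tu_size bytes with REQ set."""
    if tu_size < HEADER.size:
        raise ValueError("TU size is smaller than the test packet header")
    sent_at = time.time() if now is None else now
    header = HEADER.pack(seq, sent_at, FLAG_REQ)
    # Zero padding brings the packet up to the TU size being probed.
    return header + bytes(tu_size - HEADER.size)


def parse_test_packet(data: bytes):
    """Return (sequence number, send timestamp, REQ flag set?)."""
    seq, sent_at, flags = HEADER.unpack_from(data)
    return seq, sent_at, bool(flags & FLAG_REQ)


packet = build_test_packet(seq=7, tu_size=480, now=1.5)
print(len(packet), parse_test_packet(packet))
```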

Embodiments of the present invention are related to a Path MTU discovery technique that does not depend on any particular feature of the underlying protocol (e.g., the DF bit, etc.) and can effectively determine path MTU size.

While the above is a complete description of the preferred embodiment of the present invention, it is possible to use various alternatives, modifications and equivalents. Therefore, the scope of the present invention should be determined not with reference to the above description but should, instead, be determined with reference to the appended claims, along with their full scope of equivalents. Any feature described herein, whether preferred or not, may be combined with any other feature described herein, whether preferred or not. In the claims that follow, the indefinite article “A”, or “An” refers to a quantity of one or more of the item following the article, except where expressly stated otherwise. In the claims that follow, the expressions first and second are used to distinguish between different elements and do not imply any particular order or sequence. The appended claims are not to be interpreted as including means-plus-function limitations, unless such a limitation is explicitly recited in a given claim using the phrase “means for.”

Claims

1. A method for network communication between a first host and a second host connected by a network, the method comprising: a) sending a plurality of test packets of varying transmission unit (TU) size from the first host to the second host, wherein a “do not fragment” (DF) flag for the test packets is not set; b) determining whether one or more of the test packets were received by the second host; c) determining whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; c′) calculating an estimated path MTU size for the network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited; and d) sending one or more message packets of a size less than or equal to the estimated path MTU size over the network path.
2. The method of claim 1, wherein a) comprises sending one or more test packets of increasing size compared to an initial TU size from the first host to the second host.
3. The method of claim 2 wherein an increase in size of the test packets is characterized by a TU size increase that is less than or equal to about 50% of the initial TU size.
4. The method of claim 2 wherein the initial TU size is 576 bytes.
5. The method of claim 1, wherein c) further comprises determining a data transfer rate of test packets received by the second host and wherein the one or more patterns include a behavior of the data transfer rate of the test packets received by the second host as a function of the TU size.
6. The method of claim 5 wherein determining the data transfer rate comprises including in the test packets a request for the second host to send a reply containing the data transfer rate and receiving the reply at the first host.
7. The method of claim 5, wherein c′) includes determining a behavior of the data transfer rate of the packets received by the second host as a function of TU size.
8. The method of claim 7, wherein c′) includes determining whether a drop in data transfer rate occurs above a particular TU size.
9. The method of claim 7, wherein c′) includes determining whether the data transfer rate tends to flatten out with increasing TU size.
10. The method of claim 1 wherein the plurality of test packets comprise a plurality of User Datagram Protocol (UDP) packets.
11. The method of claim 1, further comprising repetitively testing the path between the first and second host with packets of varying size to verify a previous result.
12. The method of claim 1 wherein a) comprises sending one or more packets at an initial bandwidth and gradually increasing the bandwidth for subsequent packets until bandwidth saturation is detected.
13. The method of claim 1 wherein d) takes place by peer-to-peer connection between an application layer of a protocol stack on the first host and an application layer of a protocol stack on the second host.
14. The method of claim 1 wherein d) takes place by peer-to-peer connection between a transport layer of a protocol stack on the first host and a transport layer of a protocol stack on the second host.
15. An apparatus for discovery of a maximum transmission unit (MTU) size in a path between two nodes of a network, comprising: a first host having a computer with a computer readable program stored in a memory, wherein the program is configured to implement a method for discovery of a path MTU size between the two nodes, the method comprising: sending a plurality of test packets of varying transmission unit (TU) size from the first host to the second host; determining whether one or more of the test packets were received by the second host; determining whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; and calculating an estimated path MTU size for a network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited.
16. The apparatus of claim 15 wherein the first host includes a protocol stack having an application layer, wherein the program is configured to be implemented at the application layer.
17. A non-transitory computer readable medium encoded with a computer readable program configured to implement a method for discovery of a maximum transmission unit (MTU) size in a path between a first host and a second host connected by a network, the computer readable program comprising: a) an instruction that when executed causes the first host to send a plurality of test packets of varying transmission unit (TU) size from the first host to the second host; b) an instruction that when executed causes the first host to determine whether one or more of the test packets were received by the second host; b′) an instruction that when executed causes the first host to determine whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; and c) an instruction that when executed causes the first host to calculate an estimated path MTU size for a network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited.
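
By way of illustration only, and without limiting the claims, the probe schedule of claims 2 through 4 and the two rate patterns of claims 8 and 9 might be sketched as follows. The function names, the 1500-byte upper bound, the 0.7 drop ratio and the 5% flatness tolerance are all hypothetical values chosen for the example. The two branches of `estimate_path_mtu` are placeholder assumptions standing in for the per-path-type calculations, which the claims do not spell out.

```python
from typing import List, Optional, Tuple

# (TU size in bytes, data transfer rate reported back by the second host)
Sample = Tuple[int, float]


def probe_sizes(initial: int = 576, step_fraction: float = 0.5,
                max_size: int = 1500) -> List[int]:
    """TU sizes growing by a fixed step of at most 50% of the initial size."""
    if not 0 < step_fraction <= 0.5:
        raise ValueError("step_fraction must be in (0, 0.5]")
    step = max(1, int(initial * step_fraction))
    sizes = []
    size = initial
    while size <= max_size:  # max_size is an assumed upper bound
        sizes.append(size)
        size += step
    return sizes


def drop_index(samples: List[Sample], drop_ratio: float = 0.7) -> Optional[int]:
    """Index of the first sample whose rate falls sharply below the previous one."""
    for i in range(1, len(samples)):
        if samples[i][1] < drop_ratio * samples[i - 1][1]:
            return i
    return None


def classify_path(samples: List[Sample], flat_tolerance: float = 0.05) -> str:
    """Label the path from how the rate behaves before any sharp drop."""
    cut = drop_index(samples)
    rising = samples if cut is None else samples[:cut]
    if len(rising) < 2 or rising[-2][1] <= 0:
        raise ValueError("need two positive-rate samples before any drop")
    prev_rate, last_rate = rising[-2][1], rising[-1][1]
    growth = (last_rate - prev_rate) / prev_rate
    # A rate that stops growing with TU size is treated as bit-rate limited.
    return "bit-rate limited" if growth <= flat_tolerance else "packet-rate limited"


def estimate_path_mtu(samples: List[Sample], flat_tolerance: float = 0.05) -> int:
    """Placeholder estimate; both branches are assumptions, not the claimed rules."""
    cut = drop_index(samples)
    if cut is not None:
        return samples[cut - 1][0]  # largest TU size seen before the drop
    if classify_path(samples, flat_tolerance) == "bit-rate limited":
        peak = max(rate for _, rate in samples)
        # Assumed rule: smallest TU size that already reaches the plateau.
        return min(size for size, rate in samples
                   if rate >= (1 - flat_tolerance) * peak)
    return samples[-1][0]  # no drop observed: largest size tested
```

In this sketch a first host would send UDP test packets at the sizes returned by `probe_sizes`, collect the rate replies described in claim 6 into `(size, rate)` pairs, and pass them to `estimate_path_mtu` before sizing its message packets.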

Related Publications (1)

	Number	Date	Country
	20080298376 A1	Dec 2008	US
