Network communication with path MTU size discovery

Information

  • Patent Grant
  • 7995478
  • Patent Number
    7,995,478
  • Date Filed
    Wednesday, May 30, 2007
  • Date Issued
    Tuesday, August 9, 2011
Abstract
A method for network communication and an apparatus for discovery of a maximum transmission unit (MTU) size in a path between two nodes of a network are disclosed. A plurality of test packets of varying transmission unit (TU) size may be sent from the first host to the second host. A “do not fragment” (DF) flag for the test packets is not set. It is determined whether one or more of the test packets were received by the second host. An estimated path MTU size for a network path between the first and second hosts may then be calculated based on one or more patterns of receipt of the test packets by the second host. Once the estimated Path MTU size has been determined, message packets of a size less than or equal to the Path MTU size may be sent over the network path.
Description
FIELD OF THE INVENTION

This invention generally relates to network communications and more particularly to discovery of a maximum transmission unit (MTU) size in a path between two nodes of a network.


BACKGROUND OF THE INVENTION

When one Internet Protocol (IP) host has a large amount of data to send to another host, the data is transmitted as a series of IP datagrams. It is often preferable that these datagrams be of the largest size that does not require fragmentation anywhere along the path from the source to the destination. This datagram size is referred to as the Maximum Transmission Unit (MTU) for the path and is sometimes referred to as the Path MTU or PMTU. The Path MTU is equal to the minimum of the MTUs of each hop in the path.


Fragmenting a packet involves dividing the packet into smaller packets and adding a header to each smaller packet. Since each fragment has the same header overhead as the original message, fragmenting packets adds to the total number of bytes that need to be transmitted in order to transmit the message. This can slow down transmission. It is therefore advantageous to discover Path MTU in order to avoid fragmenting packets.
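
The overhead described above can be illustrated with a small calculation. The following is only a sketch under stated assumptions: it assumes IPv4 with a 20-byte header and no options, and it uses the IPv4 rule that the payload of every fragment except the last is a multiple of 8 bytes. The function name fragmented_bytes is hypothetical and does not come from the disclosure.

```python
import math

IPV4_HEADER = 20  # bytes; assumes an IPv4 header with no options


def fragmented_bytes(payload_len, mtu, header=IPV4_HEADER):
    """Return (fragment_count, total_bytes_on_the_wire) for one datagram.

    payload_len is the IP payload in bytes and mtu is the MTU of the hop.
    """
    # Every fragment but the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    count = max(1, math.ceil(payload_len / per_fragment))
    # Each fragment repeats the header, so the total grows with the count.
    return count, payload_len + count * header


for hop_mtu in (1500, 1000, 576):
    print(hop_mtu, fragmented_bytes(1480, hop_mtu))
```

In this sketch a 1500-byte datagram crosses a 1500-byte hop unchanged, but needs two fragments (1520 bytes in total) on a 1000-byte hop and three fragments (1540 bytes) on a 576-byte hop.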


A shortcoming of the prior art is the lack of an adequate mechanism for discovering the MTU of an arbitrary path between two hosts. Prior art techniques for Path MTU discovery are described e.g., in RFC 1191, “Path MTU Discovery”, by J. Mogul and S. Deering, which is available on the Internet at http://www.ietf.org/rfc/rfc1191.txt?number=1191, the contents of which are incorporated herein by reference. RFC 1191 describes a technique for Path MTU discovery by setting the “do not fragment” (DF) flag on packets sent by the host. If a router in the path has an MTU size smaller than the packet size, an Internet Control Message Protocol (ICMP) error is returned and the packet is dropped. Otherwise, the packet is received by the intended recipient, which verifies receipt of the packet. Unfortunately, administrative privilege is often required in order to be able to set the DF flag. In addition, not all routers are configured to provide the ICMP messages that are relied upon in this technique. In fact, most routers are not so configured.


Additional prior art path MTU discovery techniques are described by M. Mathis and J. Heffner in an internet draft titled “Packetization Layer Path MTU Discovery”, a copy of which is available on the internet at: <http://www.ietf.org:80/rfc/rfc4821.txt?number=4821>, the contents of which are incorporated herein by reference.


This RFC addresses issues with classic Path MTU discovery, which include “ICMP black holes” and ICMP blockage by firewalls. However, the Packetization Layer Path MTU Discovery (PLPMTUD) technique still has a number of drawbacks. For example, PLPMTUD techniques must be able to set the do not fragment (DF) bit to 1 for packet loss detection. Unfortunately, the DF bit cannot be controlled from applications. In addition, PLPMTUD needs to be supported by both the IP layer and the TCP/IP layer to work, and is not yet widely implemented.


It is within this context that embodiments of the present invention arise.





BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention may be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:



FIG. 1 is a block diagram illustrating a network path between two hosts.



FIG. 2 is a block diagram illustrating the protocol stacks in the hosts and routers of FIG. 1.



FIG. 3A is a graph illustrating the data transmission rate of a packet rate limited router as a function of packet size.



FIG. 3B is a graph illustrating the data transmission rate of a bit rate limited router as a function of packet size.



FIG. 4 is a flow diagram of a method for Path MTU discovery according to an embodiment of the present invention.



FIG. 5A is a graph illustrating an example of a false detection of a packet rate limited router.



FIG. 5B is a graph illustrating an example of a false detection of a bit rate limited router.



FIG. 6 is a schematic diagram of an apparatus for path MTU discovery according to an embodiment of the present invention.





DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Although the following detailed description contains many specific details for the purposes of illustration, anyone of ordinary skill in the art will appreciate that many variations and alterations to the following details are within the scope of the invention. Accordingly, the exemplary embodiments of the invention described below are set forth without any loss of generality to, and without imposing limitations upon, the claimed invention.


Embodiments of the present invention are directed to methods and apparatus for discovery of a maximum transmission unit (MTU) size in a path between a first host and a second host connected by a network. A plurality of test packets of varying transmission unit (TU) size may be sent from the first host to the second host. A “do not fragment” (DF) flag for the test packets is not set. It is determined whether one or more of the test packets were received by the second host. An estimated path MTU size may then be calculated based on one or more patterns of receipt of the test packets by the second host.


TECHNICAL BACKGROUND

Embodiments of the present invention may be understood in the context of network communications. FIG. 1 illustrates an example of network communication between Host 1 102 and Host 2 104. By way of example, the hosts may be any network capable device. Such devices include, but are not limited to, computers, hand held internet browsers and/or email devices, Voice over Internet Protocol (VoIP) phones, video game consoles, hand held video game devices, and the like. Messages from Host 1 travel to Host 2 over a network path 103 via routers 106, 108, and 110. Each router may have a different Maximum Transmission Unit (MTU). In this example, router 106 has an MTU of 1500 bytes, router 108 has an MTU of 1000 bytes and router 110 has an MTU of 1500 bytes. The path MTU for the path 103 is the smallest MTU of any router in the path, which is 1000 bytes in this example.
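
The relationship between per-router MTUs and the path MTU in the FIG. 1 example can be written as a one-line computation. This is a minimal illustrative sketch; the function name path_mtu is an assumption and not part of the disclosure.

```python
def path_mtu(hop_mtus):
    """Return the path MTU: the smallest MTU of any hop on the path."""
    if not hop_mtus:
        raise ValueError("path must contain at least one hop")
    return min(hop_mtus)


# MTUs of routers 106, 108 and 110 from the FIG. 1 example, in bytes.
example_path = [1500, 1000, 1500]
print(path_mtu(example_path))  # 1000
```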


The Hosts 102, 104 and routers 106, 108, 110 may be configured to communicate with each other according to a network protocol. FIG. 2 illustrates an example of a network protocol configuration for the situation shown in FIG. 1. By way of example, each host device 102, 104 may be configured (either in software or hardware or some combination of both) with a network protocol stack having five layers: an Application layer APP, a Transport layer TRANS, a Network layer NET (sometimes referred to as the IP layer), a Data Link Layer DLL and a Physical layer PHYS. These layers are well-known to those of skill in the art.


The Hosts 102, 104 typically implement all five layers. The routers 106, 108, 110 typically implement only the Network, Data Link and Physical layers.


By way of example, embodiments of the present invention may implement Path MTU discovery at the Application layer. Typically, the Transport layer and below are implemented in an operating system (OS) kernel and applications have no control in changing behavior at these layers. Classic PMTUD, by contrast, is typically implemented at the Transport and IP (Network) layers.


The Application layer APP represents the level at which applications access network services. This layer represents the services that directly support applications such as software for file transfers, database access, and electronic mail. Examples of application layer software include HL7, Modbus, Session Initiation Protocol (SIP), and Simple Sensor Interface Protocol (SSI). In the particular case of the TCP/IP suite, the Application layer APP may be implemented with software protocols such as Hypertext Transfer Protocol (HTTP), Session Initiation Protocol (SIP), Simple Mail Transfer Protocol (SMTP), Short Message Peer-to-Peer Protocol (SMPP), Simple Network Management Protocol (SNMP), File Transfer Protocol (FTP), Teletype Network (TELNET), Network File System (NFS), Network Time Protocol (NTP), Real-time Transport Protocol (RTP), Dynamic Host Configuration Protocol (DHCP), and Domain Name System (DNS). The Application layer APP may sometimes be divided further into a Presentation layer and a Session layer, e.g., in the Open Systems Interconnection (OSI) model. The Presentation layer translates data from the Application layer into an intermediary format. The Presentation layer may also manage security issues by providing services such as data encryption, and may compress data so that fewer bits need to be transferred on the network. The Session layer allows two applications on different computers to establish, use, and end a session. The Session layer may establish dialog control between the two computers in a session, regulating which side transmits, plus when and how long it transmits.


The Transport layer TRANS handles error recognition and recovery. For a transmitting host, the Transport layer may also repackage long messages when necessary into small packets for transmission. For a receiving host the Transport layer rebuilds packets into the original message. The Transport layer for a receiving host may also send receipt acknowledgments. Examples of particular Transport layer protocols include Transmission Control Protocol (TCP), User Datagram Protocol (UDP) and Stream Control Transmission Protocol (SCTP), all of which, and equivalents thereof, are well-known to those of skill in the art. The Transport layer TRANS is the layer that typically supports packet fragmentation. It is noted that fragmentation may take place in the Transport layer of the host originating a message or at the Transport layer of any of the routers along the path between that host and the message's intended recipient.


The Network layer NET addresses messages and translates logical addresses and names into physical addresses. It also determines the route from the source to the destination computer. The Network layer may also manage traffic problems, such as switching, routing, and controlling the congestion of data packets. Examples of particular Network layer protocols include, but are not limited to, Internet Protocol (IP), Internet Control Message Protocol (ICMP), IP Security (IPsec), Address Resolution Protocol (ARP), Routing Information Protocol (RIP) and Open Shortest Path First (OSPF), all of which, and equivalents thereof, are well-known to those of skill in the art.


The Data Link layer DLL packages raw bits from the Physical layer PHYS into frames (logical, structured packets for data). The Data Link layer may also be responsible for transferring frames from one computer to another, without errors. After sending a frame, the Data Link layer DLL waits for an acknowledgment from the receiving computer. Examples of particular Data Link layer protocols include, but are not limited to, Point-to-Point Protocol (PPP), Serial Line Internet Protocol (SLIP) and Media Access Control (MAC) all of which, and equivalents thereof, are well-known to those of skill in the art. The Data Link layer DLL typically limits the MTU size.


The Physical layer PHYS transmits bits from one computer to another and regulates the transmission of a stream of bits over a physical medium. This layer defines how the cable is attached to the network adapter and what transmission technique is used to send data over the cable. Examples of particular Physical layer protocols and standards include, but are not limited to, RS-232, V.35, V.34, I.430, I.431, T1, E1, 10BASE-T, 100BASE-TX, POTS, SONET, DSL, 802.11a, 802.11b, 802.11g, 802.11n all of which, and equivalents thereof, are well-known to those of skill in the art.


A message originating at Host 1 102 starts at the Application layer APP and works its way down the protocol stack to the Physical layer PHYS. When the message arrives at Host 2 104, it is received at the Physical layer PHYS and works its way up the stack to the Application layer APP. In the path 103 between the two hosts 102, 104, the message is received at the Physical layer PHYS of router 106 and works its way up to the Transport layer TRANS and then back down the stack to the Physical layer PHYS for transmission to router 108. The process repeats for routers 108 and 110. In peer-to-peer situations, once a connection has been established between the hosts 102, 104 they may communicate directly by peer-to-peer connections 105, e.g., at the Application layer APP or at the Transport layer TRANS.


Path MTU Discovery Method and Apparatus


By way of example, embodiments of the invention may be applied to discovery of MTU size defined at the IP (Network) layer. Alternatively, MTU size discovery as described herein may be equally applied to any supported transport protocol.


According to embodiments of the present invention, Path MTU discovery may be based on two observations. The first observation is that most routers will properly fragment packets that conform to certain Transport Layer protocols. An example of such a protocol is the User Datagram Protocol (UDP). UDP is a minimal message-oriented transport layer protocol that is described, e.g., by J. Postel in IETF RFC 768, Aug. 28, 1980, which may be accessed on the Internet at http://tools.ietf.org/html/rfc768, the contents of which are incorporated herein by reference. In the Internet protocol (IP) suite, UDP may provide a very simple interface between a network layer below (e.g., IPv4) and a session layer or application layer above. UDP is often described as being a connectionless protocol. As used herein, connectionless refers to network protocols in which a host can send a message without establishing a connection with the recipient. That is, the host simply puts the message onto the network with the destination address and hopes that it arrives. Other examples of connectionless protocols include Ethernet and IPX. UDP is typically used for message broadcast (sending a message to all on a local network) or multicast (sending a message to all subscribers). Common network applications that use UDP include the Domain Name System (DNS), streaming media applications such as Internet Protocol Television (IPTV), Voice over IP (VoIP), Trivial File Transfer Protocol (TFTP) and online games.


The second observation is that routers tend to exhibit one of two particular types of bandwidth limitation behavior. Specifically, router bandwidth limitation may be classified as being either packet rate limited or bit rate limited. In a packet rate limited router, the data transmission rate is determined by a number of packets the router can transmit per unit time. For a packet rate limited router, the size of the packets does not affect the number of packets the router can send per unit time as long as the packets are no larger than some maximum packet size, which determines the MTU for that router. Packet rate limited routers are sometimes referred to herein as being packet-per-second (pps) limited. For a pps limited router, it makes sense to send packets that are as large as possible in order to optimize the data transmission rate. For a bit rate limited router, by contrast, the data transmission rate is determined by a maximum number of bits per unit time that is independent of the packet size. Bit-rate limited routers are sometimes referred to herein as being bit-per-second (bps) limited. It is noted that both bps-limited routers and pps-limited routers may fragment a packet depending on the MTU set to the router.


The difference in behavior of the packet rate limited and bit rate limited routers is illustrated in FIGS. 3A-3B. Specifically, FIG. 3A graphically depicts the data transfer rate for UDP packets as a function of transmission unit TU size for a packet rate limited router. Packets at an initial size are sent at an initial bandwidth BW0 (e.g., 64 Kilobits per second (Kbps)). Preferably the sending host has the ability to “throttle” the bandwidth with which the packets are sent. Such a “slow-start” approach is often useful since packets are queued at each node. A long queue increases latency, which is undesirable. Long queues also tend to take a long time to be recovered. Embodiments of the present invention avoid this by adjusting the sending bandwidth BW while keeping the TU size fixed. Each packet includes a request for the receiving host to provide the data transfer rate (e.g., in bits per second (bps)) for the received packets. As the bandwidth is increased, the data transfer rate for the received packets will continue to increase until the bandwidth reaches a packet-limit saturation. At this point, increasing the bandwidth does not further increase the data transfer rate for the packets since the router transmits a fixed number of packets per second. For packets that are smaller than the router's MTU, the packet-limit saturated data transfer rate increases approximately linearly as the packet size increases as indicated by the line 302. For example, if the path contains a router having a packet limit of 32 packets per second and an initial packet size of, e.g., 480 8-bit bytes, the data transfer rate for the packets will saturate at about 120 Kbps. If the packet size is increased by 50%, e.g., to 720 bytes, but remains below the MTU size for the router, the data transfer rate will saturate at about 180 Kbps. Such linear behavior is characteristic of a pps-limited router. Packets that are greater than the MTU size for the router are fragmented into two or more packets. As a result, the number of packets increases but the packet transmission rate does not. Consequently, the data transmission rate abruptly drops just beyond the MTU size. If the packet size is again increased, the data transmission rate for a pps-limited router is expected to increase in an approximately linear fashion until the packet size reaches another integer multiple of the MTU size.
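
The saturated data transfer rate of a pps-limited router described above may be sketched as a simple model. The sketch is an idealization that ignores per-fragment header overhead; the function name pps_saturated_rate and its parameters are illustrative assumptions rather than part of the disclosure.

```python
import math


def pps_saturated_rate(tu_bytes, mtu_bytes, packets_per_second):
    """Idealized saturated data rate (bits/s) through a pps-limited router.

    A TU larger than the MTU is split into ceil(TU / MTU) fragments, and
    each fragment uses up one of the router's packet slots.
    """
    fragments = math.ceil(tu_bytes / mtu_bytes)
    return packets_per_second * tu_bytes * 8 / fragments


# 32 packets per second, as in the example above.
print(pps_saturated_rate(480, 1000, 32))  # 122880.0 bps = 120 x 1024
print(pps_saturated_rate(720, 1000, 32))  # 184320.0 bps = 180 x 1024
print(pps_saturated_rate(720, 576, 32))   # 92160.0 bps: 720 > MTU, two fragments
```

The first two values reproduce the figures of about 120 Kbps and about 180 Kbps when a kilobit is taken as 1024 bits. The third shows the abrupt drop once the TU size exceeds the MTU.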


Rate limitation, which may occur either intentionally or unintentionally, could happen at any layer in the protocol stack. One “intentional” case that is very common is to set up IP tables (set policies within the IP and transport layers) to throttle bandwidth. Bandwidth saturation may be detected at the receiver side by observing packet loss and increase of latency. As described above, there are a series of queues in the path. When saturation occurs somewhere in the path, a queue right before the saturation point starts accumulating packets. This may be observed as an “increase of latency” at the receiver by checking timestamps added to each packet. Eventually, the queue becomes full and packets start being dropped, which may also be observed at the receiver side by checking sequence numbers attached to each packet.
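
Receiver-side saturation detection from sequence numbers and timestamps, as described above, might be sketched as follows. The record layout, the function name detect_saturation and both thresholds are hypothetical choices made for illustration; the disclosure does not specify them.

```python
def detect_saturation(records, loss_threshold=0.05, delay_growth_s=0.050):
    """Return True if received test packets suggest a saturated path.

    records is a list of (sequence_number, sent_at, received_at) tuples for
    the packets that actually arrived, in arrival order. Only the growth of
    the delay is used, so a constant clock offset between the two hosts
    cancels out.
    """
    if len(records) < 2:
        return False
    seqs = [seq for seq, _, _ in records]
    expected = max(seqs) - min(seqs) + 1
    loss = 1 - len(set(seqs)) / expected  # gaps in sequence numbers
    delays = [received - sent for _, sent, received in records]
    half = len(delays) // 2
    early = sum(delays[:half]) / half
    late = sum(delays[half:]) / (len(delays) - half)
    return loss > loss_threshold or (late - early) > delay_growth_s


healthy = [(i, i * 0.01, i * 0.01 + 0.02) for i in range(10)]
queueing = [(i, i * 0.01, i * 0.01 + 0.02 + i * 0.03) for i in range(10)]
lossy = [(i, i * 0.01, i * 0.01 + 0.02) for i in (0, 1, 2, 5, 6, 9)]
print(detect_saturation(healthy), detect_saturation(queueing),
      detect_saturation(lossy))
```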



FIG. 3B graphically depicts the data transfer rate for UDP packets as a function of packet size for a bit rate limited router. It is noted that a bit rate limited router is generally not sensitive to fragmentation of the packets because it is not affected by the number of packets sent per second. For example, sending 1000-byte packets at 1 packet/sec or 500-byte packets at 2 packets/sec is the same for a bit rate limited router. However, although the bandwidth may be more or less fixed for such a router, the data transfer rate (e.g., in bits per second) may vary due to a more or less constant latency associated with transmission of each packet. As a result of the latency, the data transfer rate for a bps-limited router will tend to increase with packet size. However, as the data transmission rate approaches a bandwidth limit BWL, the transmission rate will tend to flatten off as a function of packet size.
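
One way to picture the flattening behavior of FIG. 3B is a toy model in which each packet costs a fixed latency plus its serialization time at the bandwidth limit BWL. This formula is an assumption introduced here purely for illustration; the disclosure describes the shape of the curve but does not give an equation. The function name and parameter values are likewise hypothetical.

```python
def bps_limited_rate(tu_bytes, bandwidth_limit_bps, per_packet_latency_s):
    """Toy model of the data rate (bits/s) through a bps-limited router.

    Assumes each packet costs a constant latency plus bits / BWL seconds.
    """
    bits = tu_bytes * 8
    return bits / (per_packet_latency_s + bits / bandwidth_limit_bps)


BWL = 1_000_000  # hypothetical 1 Mbps limit
for tu in (240, 480, 960, 1920):
    print(tu, round(bps_limited_rate(tu, BWL, 0.002)))
```

Under this model the rate rises with TU size, each increase is smaller than the last, and the rate never exceeds BWL.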


Thus, based on an understanding of the two types of router behavior illustrated in FIGS. 3A-3B, path MTU discovery may proceed according to a method 200 as shown in FIG. 4. As indicated at 202, test packets may be transmitted from one host to a recipient (e.g., from host 102 to host 104) with a small initial TU size and a small initial transmission bandwidth BW. (See FIG. 1). The DF flag for these packets is not set so that routers along the path 103 may fragment the packets normally, if they are configured to do so. As the packets are received, the transmitting host determines the data transfer rate for the packets, as indicated at 204. By way of example, each packet may include a request for the receiving host to send back a message that indicates the data transfer rate R for the test packet. The sending host probes for saturation behavior at 206. If saturation is not observed, the transmission bandwidth BW is gradually increased with the same TU size at 208, while probing packet loss and growth of delay at the receiver side. When significant packet loss or growth of delay is detected, it may be assumed that the bandwidth with the TU size is saturated. The values of TU and R may be recorded at this point as indicated at 210. The TU size may then be increased, e.g., by 50% of the initial TU size. If the bandwidth is pps limited, it is expected that the bandwidth will grow linearly with TU size until the path MTU size (or an integer multiple thereof) is reached. If the TU size exceeds the actual path MTU size and the bandwidth is pps-limited, the receiver will detect that the data transfer rate is less than for the previous TU size. The example in FIG. 5A shows how the data transfer rate may behave when the TU size exceeds the actual path MTU size. When TU size exceeds the path MTU size, an intermediary node that has the MTU size set will start fragmenting long packets to fit them into the MTU size. This causes an increase in the number of packets, and a consequent decrease in the transfer rate since the bandwidth is pps limited. Specifically, just above the MTU size, the packets are split into two, which results in a drop in data transfer rate by one half. Just above twice the MTU size the packets are split into three, which results in a drop in data transfer rate by one third. Just above three times the MTU size the packets are split into four, which results in a drop in data transfer rate by one fourth. By detecting this bandwidth drop, network applications can detect the path MTU size to maximize available bandwidth.


If the bandwidth is bps limited, by contrast, the bandwidth will tend to grow until it reaches a bandwidth saturation level, e.g., as shown in FIG. 3B. The data transfer rate for a bps-limited router tends to flatten off without the characteristic drops seen in FIG. 5A. Thus, it is possible to determine router behavior and path MTU size by observing the dependence of data transfer rate R on TU size. By way of example, after each R and TU value have been recorded at 210 the sending host may check at 212 to determine if the latest value of R is less than the previous value of R. If so, the path MTU may be determined from the behavior of R versus TU at 214 based on a packet-rate limit assumption. If saturation of R as a function of TU (e.g., as shown in FIG. 3B) is detected at 216, the path MTU may be calculated based on a bit-rate limit assumption at 218. If such saturation behavior is not detected, the TU may be increased at 220 and the process may repeat at 202, 204, 206, etc. Once the Path MTU size has been determined, message packets of a size less than or equal to the Path MTU size may be sent over the path 103 to the second host 104, as indicated at 222. It is noted that the host that performs the path MTU discovery need not be one that sends the message packets. For example, if two or more hosts are connected to the second host 104 by the same path 103 it is possible for one of these hosts to perform path MTU discovery and notify another of these hosts of the path MTU. Any of these hosts may then send message packets that are less than or equal to the path MTU size over the path 103.
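
The decision steps 212 through 220 of method 200 may be sketched as a loop over increasing TU sizes. This is a simplified illustration, not the claimed implementation: measure_rate is a hypothetical callback standing in for steps 202-210 (sending test packets and obtaining the saturated rate R), the flatness tolerance is an arbitrary choice, and the two simulated paths are toy models. The disclosure does not detail how the path MTU is calculated at 218, so the sketch returns no estimate in the bit-rate limited case.

```python
import math


def classify_path(measure_rate, initial_tu=480, step=240, max_tu=3000,
                  flat_tolerance=0.02):
    """Classify a path from R versus TU; return (kind, bracket).

    bracket is the TU interval [lower, upper) in which fragmentation first
    appeared, or None when no drop in R was seen.
    """
    prev_tu, prev_rate = initial_tu, measure_rate(initial_tu)
    tu = initial_tu + step
    while tu <= max_tu:
        rate = measure_rate(tu)
        if rate < prev_rate:
            # Steps 212/214: R dropped, so the path MTU lies in this bracket.
            return "packet-rate limited", (prev_tu, tu)
        if rate - prev_rate <= flat_tolerance * prev_rate:
            # Steps 216/218: R has flattened without any drop.
            return "bit-rate limited", None
        prev_tu, prev_rate = tu, rate
        tu += step  # step 220
    return "undetermined", None


def simulated_pps_path(tu, mtu=1000, pps=32):
    """Toy pps-limited path: each fragment uses one packet slot."""
    return pps * tu * 8 / math.ceil(tu / mtu)


def simulated_bps_path(tu, limit_bps=1_000_000, per_packet_latency_s=0.002):
    """Toy bps-limited path: fixed latency plus serialization time."""
    bits = tu * 8
    return bits / (per_packet_latency_s + bits / limit_bps)


print(classify_path(simulated_pps_path))  # ('packet-rate limited', (960, 1200))
print(classify_path(simulated_bps_path))  # ('bit-rate limited', None)
```

For the simulated pps-limited path with a 1000-byte MTU, the drop appears between TU sizes of 960 and 1200 bytes, which brackets the true value. A finer search inside the bracket could narrow the estimate further.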


It is important that the initial TU size and the increase in TU size be chosen carefully in order to detect the drop in data transfer rate. For example, if the TU size increase is 100% (doubled), the receiver side may not detect enough of a bandwidth drop to be confident that the TU size exceeded the actual path MTU size. FIG. 5A illustrates an example of the problem. Suppose the path is packet rate limited with a path MTU (PMTU) of about 576 bytes and the initial TU size is 480 bytes. If the TU size increases by 480 bytes each time, the drops in transfer rate at PMTU, 2×PMTU and 3×PMTU might not be observed. The measured data transfer rate may be the same at TU size 480 as for 960 and 1440. Such behavior (represented by the dashed curve 502) might be misinterpreted as a characteristic of a bit-rate limited path. Thus, too large an increase in TU size can lead to an erroneous determination of bit-rate limited path behavior. If, instead, the TU size increases by 50% of the initial TU size, the pattern of measurements at TU sizes of 480, 720, 960, 1200 and 1440 bytes reveals the bandwidth drop at PMTU, 2×PMTU, 3×PMTU, etc., as represented by the solid curve 504, behavior which is associated with a packet rate limited path.
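
The effect of the step size in the FIG. 5A example can be checked numerically. The sketch below uses the same idealized pps-limited model (fragment headers ignored) and expresses the rate as payload bytes delivered per router packet slot; the function name relative_rate is an illustrative assumption.

```python
import math


def relative_rate(tu, pmtu):
    """Payload bytes delivered per router packet slot on a pps-limited path."""
    return tu / math.ceil(tu / pmtu)


PMTU = 576
coarse = [relative_rate(tu, PMTU) for tu in (480, 960, 1440)]
fine = [relative_rate(tu, PMTU) for tu in (480, 720, 960, 1200, 1440)]
print(coarse)  # [480.0, 480.0, 480.0]
print(fine)    # [480.0, 360.0, 480.0, 400.0, 480.0]
```

With 480-byte steps every measurement yields the same rate, which resembles a flat bit-rate limited curve. With 240-byte steps the dips at 720 and 1200 bytes expose the fragmentation.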



FIG. 5B illustrates an example of a possible erroneous determination of packet rate limited behavior. It is possible that the bit-rate limit for a bps-limited router may change over time. If, for example, there is a sudden drop in the bit-rate after an initial measurement at a TU size of 480 bytes, the actual bps-limited router behavior, represented by the solid curve 512, may exhibit a sharp drop in data transfer rate followed by a recovery to a lower value. In such a case, it is possible that subsequent measurements taken at 720-byte and 960-byte TU sizes may be misinterpreted as being characteristic of packet rate limited behavior, represented by the dashed lines 514. Such an erroneous determination may be avoided, e.g., by sending out test packets at sufficiently short intervals of time that the bit rate doesn't change dramatically during the interval. In addition, it may be useful to send out packets over a sufficiently broad range of TU sizes that the true behavior may be determined.


From FIGS. 5A-5B it may be seen that “false positives” may be avoided if the initial TU size and TU size increase are not too large. For most, if not all, routers the MTU size is greater than or equal to some minimum value, e.g., 576 bytes. Thus, it is desirable to choose the initial TU size to be slightly less (e.g., about 15-20% less) than the minimum value. It is possible that the MTU size of a router may be set to less than 576. It is noted that there is no clear standardized requirement on the minimum MTU size of IPv4; however, current practice in the Internet is that the minimum MTU is known to be 576 as stated in RFC 1122 section 3.3.3 “Fragmentation”. Furthermore, in RFC 1063 (IP MTU Discovery Options), for example, 576 bytes is also recommended as a default minimum MTU size. For the reasons explained above with respect to FIG. 5A, it is also desirable that the TU size increase be less than or equal to about 50% of the initial TU size. Furthermore, it is also desirable to repetitively test the path between the hosts 102, 104 with test packets of varying size to verify a previous result.
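
The guidance above on choosing the initial TU size and the TU size increase can be expressed as a small sanity check. The function name check_probe_plan is hypothetical, and the hard 15-20% and 50% bounds are taken from the examples in the text, which are stated there only as approximate values.

```python
def check_probe_plan(initial_tu, step, min_mtu=576):
    """Return a list of problems with a probing plan (empty if none found)."""
    problems = []
    margin = (min_mtu - initial_tu) / min_mtu
    if not 0.15 <= margin <= 0.20:
        problems.append("initial TU is not 15-20% below the minimum MTU")
    if step > 0.5 * initial_tu:
        problems.append("TU increase exceeds 50% of the initial TU")
    return problems


print(check_probe_plan(480, 240))  # []
print(check_probe_plan(480, 480))  # step too large
```

An initial TU size of 480 bytes is about 16.7% below 576 bytes, and a 240-byte increase is exactly 50% of it, so the plan used in the FIG. 5A example passes both checks.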


As depicted in FIG. 6, a host device 600 may include a processor 601 and a memory 602. The processor 601 may be a microprocessor or microcontroller chip of a type commonly used in electronic devices, e.g., a PIC microcontroller from Microchip Technology, Inc. of Chandler, Ariz. Alternatively, the processor 601 may be a parallel processor module, such as a Cell Processor. An example of a Cell Processor architecture is described in detail, e.g., in Cell Broadband Engine Architecture, copyright International Business Machines Corporation, Sony Computer Entertainment Incorporated, Toshiba Corporation Aug. 8, 2005 a copy of which may be downloaded at http://cell.scei.co.jp/, the entire contents of which are incorporated herein by reference. The memory 602 may be in the form of an integrated circuit (e.g., RAM, DRAM, ROM, and the like). A computer program 603 may be stored in the memory 602 in the form of processor readable instructions that can be executed on the processor 601. The instructions of the program 603 may include the steps of the method 200 as depicted in FIG. 4 and described herein. The device 600 may optionally include a control module 606. The control module may be mechanically mounted to or physically incorporated into the device 600. Alternatively the control module 606 may be a remote unit that interacts with the rest of the device 600 via a communication link, which may be a cable or wireless link.


The device 600 may also include well-known support functions 610, such as input/output (I/O) elements 611, power supplies (P/S) 612, a clock (CLK) 613 and cache 614. The device 600 may optionally include a mass storage device 615 such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data. The device 600 may also optionally include a display unit 616 and user interface unit 618 to facilitate interaction between the device 600 and a user. The support functions 610, mass storage 615, display 616 and user interface 618 may be coupled to the processor and/or memory by a data bus 620. The display unit 616 may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, graphical symbols or images. The user interface 618 may include a keyboard, mouse, joystick, light pen or other device. As shown in the particular example depicted in FIG. 6, the user interface 618 may be incorporated into the control module 606. The device 600 may also include a network interface 622 to enable the device to communicate with other devices over a network 625, such as the internet. These components may be implemented in hardware, software or firmware or some combination of two or more of these. The host device 600 may send test data packets 624 and send and receive message packets 626 over the network 625. Each test packet may include a request REQ for a remotely located host to reply to the host device 600 if the remote host receives the test packet. The request REQ may also request the remote host to provide a data transfer rate for the test packet with the reply.
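
A test packet carrying the request REQ might be laid out as in the following sketch. The wire format shown here (a 4-byte sequence number, an 8-byte send timestamp and a 1-byte flags field, padded with zeros to the desired TU size) is entirely hypothetical; the disclosure does not define a packet format. The sketch also treats the TU size as the length of the application payload, which is a further assumption.

```python
import struct
import time

# Hypothetical header: sequence number, send timestamp, flags (network order).
HEADER = struct.Struct("!IdB")
FLAG_REQ = 0x01  # asks the receiver to reply with the data transfer rate


def build_test_packet(seq, tu_size, sent_at=None, request_rate=True):
    """Build a test packet payload of exactly tu_size bytes."""
    if tu_size < HEADER.size:
        raise ValueError("TU size is smaller than the test packet header")
    if sent_at is None:
        sent_at = time.time()
    flags = FLAG_REQ if request_rate else 0
    # Zero padding brings the payload up to the TU size under test.
    return HEADER.pack(seq, sent_at, flags) + bytes(tu_size - HEADER.size)


def parse_test_packet(data):
    """Return (seq, sent_at, rate_requested, size) for a received packet."""
    seq, sent_at, flags = HEADER.unpack_from(data)
    return seq, sent_at, bool(flags & FLAG_REQ), len(data)


packet = build_test_packet(7, 480, sent_at=1.5)
print(len(packet), parse_test_packet(packet))
```

The sequence number and timestamp fields correspond to the values the receiver is described as checking when it looks for packet loss and growth of latency.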


Embodiments of the present invention are related to a Path MTU discovery technique that does not depend on any requirement of the underlying protocol (e.g., DF bit, etc.) and can effectively determine path MTU size.


While the above is a complete description of the preferred embodiment of the present invention, it is possible to use various alternatives, modifications and equivalents. Therefore, the scope of the present invention should be determined not with reference to the above description but should, instead, be determined with reference to the appended claims, along with their full scope of equivalents. Any feature described herein, whether preferred or not, may be combined with any other feature described herein, whether preferred or not. In the claims that follow, the indefinite article “A”, or “An” refers to a quantity of one or more of the item following the article, except where expressly stated otherwise. In the claims that follow, the expressions first and second are used to distinguish between different elements and do not imply any particular order or sequence. The appended claims are not to be interpreted as including means-plus-function limitations, unless such a limitation is explicitly recited in a given claim using the phrase “means for.”

Claims
  • 1. A method for network communication between a first host and a second host connected by a network, the method comprising: a) sending a plurality of test packets of varying transmission unit (TU) size from the first host to the second host, wherein a “do not fragment” (DF) flag for the test packets is not set; b) determining whether one or more of the test packets were received by the second host; c) determining whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; c′) calculating an estimated path MTU size for the network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited; and d) sending one or more message packets of a size less than or equal to the estimated path MTU size over the network path.
  • 2. The method of claim 1, wherein a) comprises sending one or more test packets of increasing size compared to an initial TU size from the first host to the second host.
  • 3. The method of claim 2 wherein an increase in size of the test packets is characterized by a TU size increase that is less than or equal to about 50% of the initial TU size.
  • 4. The method of claim 2 wherein the initial TU size is 576 bytes.
  • 5. The method of claim 1, wherein c) further comprises determining a data transfer rate of test packets received by the second host and wherein the one or more patterns include a behavior of the data transfer rate of the test packets received by the second host as a function of the TU size.
  • 6. The method of claim 5 wherein determining the data transfer rate comprises including in the test packets a request for the second host to send a reply containing the data transfer rate and receiving the reply at the first host.
  • 7. The method of claim 5, wherein c′) includes determining a behavior of the data transfer rate of the packets received by the second host as a function of TU size.
  • 8. The method of claim 7, wherein c′) includes determining whether a drop in data transfer rate occurs above a particular TU size.
  • 9. The method of claim 7, wherein c′) includes determining whether the data transfer rate tends to flatten out with increasing TU size.
  • 10. The method of claim 1 wherein the plurality of test packets comprise a plurality of User Datagram Protocol (UDP) packets.
  • 11. The method of claim 1, further comprising repetitively testing the path between the first and second host with packets of varying size to verify a previous result.
  • 12. The method of claim 1 wherein a) comprises, sending one or more packets at an initial bandwidth and gradually increasing the bandwidth for subsequent packets until bandwidth saturation is detected.
  • 13. The method of claim 1 wherein d) takes place by peer-to-peer connection between an application layer of a protocol stack on the first host and an application layer of a protocol stack on the second host.
  • 14. The method of claim 1 wherein d) takes place by peer-to-peer connection between a transport layer of a protocol stack on the first host and a transport layer of a protocol stack on the second host.
  • 15. An apparatus for discovery of a maximum transmission unit (MTU) size in a path between two nodes of a network, comprising: a first host having a computer with a computer readable program stored in a memory, wherein the program is configured to implement a method for discovery of a path MTU size between the two nodes, the method comprising: sending a plurality of test packets of varying transmission unit (TU) size from the first host to the second host; determining whether one or more of the test packets were received by the second host; determining whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; and calculating an estimated path MTU size for a network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited.
  • 16. The apparatus of claim 15 wherein the first host includes a protocol stack having an application layer, wherein the program is configured to be implemented at the application layer.
  • 17. A non-transitory computer readable medium encoded with a computer readable program configured to implement a method for discovery of a maximum transmission unit (MTU) size in a path between a first host and a second host connected by a network, the computer readable program comprising: a) an instruction that when executed causes the first host to send a plurality of test packets of varying transmission unit (TU) size from the first host to the second host; b) an instruction that when executed causes the first host to determine whether one or more of the test packets were received by the second host; b′) an instruction that when executed causes the first host to determine whether a network path between the first and second hosts is bit-rate limited or packet rate limited based on one or more patterns of receipt of the test packets by the second host; and c) an instruction that when executed causes the first host to calculate an estimated path MTU size for a network path between the first and second hosts based on the one or more patterns of receipt of the test packets by the second host, wherein the path MTU size is determined differently if the network path is bit-rate limited than if the network path is packet rate limited.
Related Publications (1)
Number Date Country
20080298376 A1 Dec 2008 US