This application claims priority to Indian Provisional Application No. 202311058856, filed Sep. 1, 2023, which is incorporated herein by reference in its entirety.
Examples of the present disclosure generally relate to a hardware solution for fragmenting an IP packet.
IP fragmentation is a process used in the Internet Protocol (IP) to transmit data packets that are larger than the maximum transmission unit (MTU) supported by a network. When a packet is too large to be transmitted across a network in a single piece, IP fragmentation divides the packet into smaller fragments that can be transmitted and reassembled at the destination.
One benefit of IP fragmentation includes allowing communication between networks with different MTU sizes. Networks may have diverse infrastructures, and some networks might have smaller MTU sizes due to various constraints. By fragmenting packets, data can still be transmitted across these networks without requiring network-wide MTU adjustments.
Another benefit of IP fragmentation includes data transfer flexibility enabling the transmission of large packets, such as file transfers or multimedia streams, without the need for packet segmentation on the application layer. This allows applications to send larger chunks of data in a single IP packet, simplifying the data transfer process and potentially reducing the overhead of handling multiple smaller packets.
However, current IP fragmentation techniques typically rely on software, which is slow and requires processor time that could be used to perform other tasks.
One embodiment described herein is a network interface controller or card (NIC) that includes replicator circuitry configured to replicate a packet to generate a plurality of replicated packets and fragmentation circuitry configured to shrink the plurality of replicated packets to satisfy an MTU size based on unique identifiers assigned to the plurality of replicated packets.
One embodiment described herein is an integrated circuit that includes replicator circuitry configured to replicate an internet protocol (IP) packet to generate a plurality of replicated IP packets and fragmentation circuitry configured to shrink the plurality of replicated IP packets to form packet fragments based on unique identifiers assigned to the plurality of replicated IP packets.
One embodiment described herein is a method that includes replicating, in a NIC, a packet to generate a plurality of replicated packets and shrinking, in the NIC, the plurality of replicated packets to satisfy an MTU size based on unique identifiers assigned to the plurality of replicated packets.
So that the manner in which the above recited features can be understood in detail, a more particular description, briefly summarized above, may be had by reference to example implementations, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical example implementations and are therefore not to be considered limiting of its scope.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements of one example may be beneficially incorporated in other examples.
Various features are described hereinafter with reference to the figures. It should be noted that the figures may or may not be drawn to scale and that the elements of similar structures or functions are represented by like reference numerals throughout the figures. It should be noted that the figures are only intended to facilitate the description of the features. They are not intended as an exhaustive description of the embodiments herein or as a limitation on the scope of the claims. In addition, an illustrated example need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular example is not necessarily limited to that example and can be practiced in any other examples even if not so illustrated, or if not so explicitly described.
Embodiments herein describe creating multiple packet fragments from a large packet that, for example, exceeds an MTU supported by a network. In one embodiment, a network interface controller or card (NIC) replicates the large packet to form multiple copies (i.e., replicated packets). The number of replications can correspond to the number of fragments needed so the MTU is not exceeded. For example, if the large packet is 8000 bytes and the MTU is 2000 bytes, the NIC generates four replicated packets.
In one embodiment, the NIC assigns an identifier, such as an ID or a count value, to each replicated packet. For the first replicated packet, the NIC may cut off (i.e., truncate) the last 6000 bytes of the packet to form a first packet fragment of 2000 bytes. For the second replicated packet, the NIC may remove (or omit) 2000 bytes from the beginning of the payload and the last 4000 bytes at the end of the payload to form a second packet fragment of 2000 bytes. For the third replicated packet, the NIC may remove (or omit) 4000 bytes from the beginning of the payload and the last 2000 bytes at the end of the payload to form a third packet fragment of 2000 bytes. For the fourth replicated packet, the NIC may remove (or omit) 6000 bytes from the beginning of the payload but not remove any of the payload at the end of the packet to form a fourth packet fragment of 2000 bytes. In this manner, the combined payloads (or union) of the four packet fragments are the same as the payload in the large packet.
The NIC can then update the headers of the packet fragments and forward them on the network. Advantageously, the embodiments herein can perform IP fragmentation using only hardware elements (e.g., without involving software or a general purpose processor). That is, by assigning identifiers to replicated packets, the hardware in the NIC can track which portions of the payload to remove (e.g., how much to remove/omit at the beginning of the payload and how much to truncate at the end of the packet) in order to create packet fragments that collectively contain the entire payload of the original packet and satisfy the MTU size of the network.
The large packet 105 is first processed by evaluation circuitry 110 which determines, among other things, whether the packet 105 exceeds the MTU size of the network. If so, the NIC 100 performs IP fragmentation. As discussed above, IP fragmentation is a process used to transmit data packets that are larger than the MTU supported by a network. When a packet is too large to be transmitted across a network in a single piece, IP fragmentation divides the packet into smaller fragments that can be transmitted and reassembled at the destination.
After the evaluation circuitry 110 determines the packet 105 exceeds the MTU size, the circuitry 110 forwards the packet to an input of replicator circuitry 115 which replicates the packet 105. In one embodiment, the replicated packets 120 are exact copies of the large packet 105. That is, each of the replicated packets 120 may have the same header and payload as the large packet 105. As such, the replicated packets 120 also exceed the MTU size of the network. While the replicated packets 120 may be exact copies of the large packet 105, in other embodiments there may be small differences between the replicated packets 120 and the large packet 105 (e.g., they have different identifiers in the intrinsic or hardware headers). However, the discussion that follows will assume that the replicated packets 120 match the received packet 105.
The replicated packets 120 are received at fragmentation circuitry 125 which reduces the payload in the replicated packets 120 such that the resulting packet fragments 130 satisfy the MTU size of the network. Moreover, the fragmentation circuitry 125 can reduce the payload of the replicated packets 120 such that the union of the resulting payloads in the packet fragments 130 is the same as the payload of the large packet 105. Stated differently, the combined payloads in the packet fragments include the same data as contained in the payload of the large packet 105. Thus, no payload data is lost when performing IP fragmentation.
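The replicate-then-shrink behavior described above can be modeled in software as a rough sketch. This is only an illustration of the data movement, not the circuitry itself. The function name `fragment_by_replication` and the parameter `mtu_payload` (the payload bytes allowed per fragment) are assumptions introduced here, as is the use of a zero-based count as the identifier.

```python
def fragment_by_replication(payload: bytes, mtu_payload: int) -> list[bytes]:
    """Model replicate-then-shrink: copy the payload, then trim each copy."""
    if mtu_payload <= 0:
        raise ValueError("mtu_payload must be positive")
    # Number of replicas needed so that no fragment exceeds mtu_payload.
    count = max(1, -(-len(payload) // mtu_payload))  # ceiling division
    # Replication step: every replica starts as a full copy of the payload.
    replicas = [(ident, bytes(payload)) for ident in range(count)]
    fragments = []
    for ident, copy in replicas:
        start = ident * mtu_payload                 # bytes dropped from the front
        end = min(start + mtu_payload, len(copy))   # bytes after `end` are truncated
        fragments.append(copy[start:end])
    return fragments


# Stand-in 8000-byte payload; the byte pattern is arbitrary.
original = bytes(i % 251 for i in range(8000))
fragments = fragment_by_replication(original, 2000)
# Concatenating the fragments in identifier order recovers the payload,
# which is the "no payload data is lost" property.
assert b"".join(fragments) == original
```

Each replica is trimmed independently of the others, which is what allows the identifier alone to decide which bytes survive.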
The fragmentation circuitry 125 can also update the headers in the packet fragments 130 so they can be reassembled in the network device that receives the fragments 130. The embodiments herein are not limited to any particular type of algorithm or technique for updating the headers in packet fragments 130. As a non-limiting example, the packets can be processed as described in Request for Comment (RFC) 791 so the fragments can be reassembled at their destination.
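As one hedged illustration of an RFC 791 style header update, the sketch below computes the fields that typically differ between fragments: Total Length, the More Fragments (MF) flag, and the Fragment Offset, which RFC 791 expresses in units of 8 bytes. The function name `fragment_fields`, its parameters, and the returned dictionary layout are assumptions made for this example. It also assumes the original packet is not itself a fragment and carries a 20-byte header without options.

```python
import struct

MF_FLAG = 0x2000  # "more fragments" bit within the 16-bit flags/offset word


def fragment_fields(index: int, count: int, frag_payload: int,
                    this_payload: int, header_len: int = 20) -> dict:
    """Return the RFC 791 fields that differ between fragments (sketch)."""
    byte_offset = index * frag_payload
    if byte_offset % 8:
        raise ValueError("fragment offsets must fall on 8-byte boundaries")
    offset_units = byte_offset // 8       # Fragment Offset counts 8-byte units
    if offset_units > 0x1FFF:
        raise ValueError("offset does not fit in the 13-bit field")
    more = index < count - 1              # MF is cleared only on the last fragment
    flags_and_offset = (MF_FLAG if more else 0) | offset_units
    return {
        "total_length": header_len + this_payload,
        "more_fragments": more,
        "fragment_offset": offset_units,
        # Network byte order, as the word would appear in the IP header.
        "flags_offset_bytes": struct.pack("!H", flags_and_offset),
    }


# Second of four fragments of an 8000-byte payload split into 2000-byte pieces.
fields = fragment_fields(index=1, count=4, frag_payload=2000, this_payload=2000)
```

A real header update would also recompute the header checksum and keep the Identification field identical across the fragments; both are omitted here for brevity.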
In one embodiment, the IP fragmentation illustrated in
The packet fragments 130 also include respective headers (20 bytes in this example). These headers may be different than the header in the large packet 105. For example, the headers for the packet fragments 130 may be updated to include information so that the network destination can reassemble the payloads in the packet fragments 130 to create the 10,000 byte data payload in the original packet 105. As mentioned above, any suitable technique (e.g., RFC 791) can be used to create and update the headers for the packet fragments 130.
At block 310, the NIC determines whether the packet is too large. For example, evaluation circuitry in the NIC may process the packet to determine whether it exceeds the MTU of the network associated with a destination of the packet. If not, the method 300 proceeds to block 315 where the NIC transmits the packet as is. That is, the NIC does not perform IP fragmentation before forwarding the packet to its destination.
However, if the packet is too large, the method 300 proceeds to block 320 where the NIC replicates the packet based on the MTU size. For example, the NIC may replicate the packet so that there are sufficient fragments to carry the payload of the original packet without exceeding the MTU size. In one embodiment, the NIC divides the size of the packet by the MTU size to determine how many replicated packets should be generated. For example, if the packet is 8000 bytes and the MTU is 2000 bytes, the NIC creates four replicated packets. If the packet is 10,000 bytes and the MTU is 2000 bytes, the NIC creates five replicated packets, and so forth.
If the MTU size does not divide evenly into the packet size, the NIC can round up to determine the number of replicated packets. For example, if the packet is 8500 bytes and the MTU is 2000 bytes, the NIC creates five replicated packets. The resulting packet fragments may have different sizes (e.g., four of the packet fragments are 2000 bytes and one packet fragment is 500 bytes).
In one embodiment, the number of packet fragments is determined by the following pseudo code:
Total fragments (t)=Incoming packet payload size (p)/MTU payload size (m)[Integer division]
If ((m*t)<p) then t=t+1
Emit payload size (k=m) except last fragment whose emit payload size will be k=p−(m*(t−1))
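The pseudo code above can be written as a short runnable sketch. The function name `plan_fragments` is an assumption introduced here; the variable names `p`, `m`, `t`, and `k` follow the pseudo code.

```python
def plan_fragments(p: int, m: int) -> tuple[int, list[int]]:
    """Return (t, emit_sizes) for payload size p and MTU payload size m."""
    if p <= 0 or m <= 0:
        raise ValueError("p and m must be positive")
    t = p // m                      # integer division
    if m * t < p:                   # a remainder needs one extra fragment
        t += 1
    sizes = [m] * (t - 1)           # k = m for every fragment except the last
    sizes.append(p - m * (t - 1))   # last fragment: k = p - (m * (t - 1))
    return t, sizes


print(plan_fragments(8500, 2000))  # (5, [2000, 2000, 2000, 2000, 500])
print(plan_fragments(8000, 2000))  # (4, [2000, 2000, 2000, 2000])
```

When `m` divides `p` evenly, the last-fragment formula reduces to `m`, so no special case is needed.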
In another embodiment, the packet fragments may have the same size (e.g., the five packet fragments could each have a size of 1700 bytes).
At block 325, the NIC assigns an identifier to each of the replicated packets. The identifier can be any data structure that uniquely distinguishes one of the replicated packets from the other replicated packets generated at block 320. In one embodiment, the identifier can be an ID assigned to each of the replicated packets. In another embodiment, the identifier could be a count value assigned to each replicated packet. Thus, the embodiments herein are not limited to any particular technique for identifying the replicated packets.
At block 330, the NIC shrinks the replicated packets to satisfy the MTU size using the identifier. In one embodiment, the data removed from each of the replicated packets depends on its corresponding identifier. For example, assume that an 8000 byte packet is being fragmented into four 2000 byte packets. For the first replicated packet, the NIC may cut off (i.e., truncate) the last 6000 bytes of the packet to form a first packet fragment of 2000 bytes. For the second replicated packet, the NIC may remove (or omit) 2000 bytes from the beginning of the payload and the last 4000 bytes at the end of the payload to form a second packet fragment of 2000 bytes. For the third replicated packet, the NIC may remove (or omit) 4000 bytes from the beginning of the payload and the last 2000 bytes at the end of the payload to form a third packet fragment of 2000 bytes. For the fourth replicated packet, the NIC may remove (or omit) 6000 bytes from the beginning of the payload but not remove any of the payload at the end of the packet to form a fourth packet fragment of 2000 bytes. In this manner, the combined payloads (or union) of the four packet fragments are the same as the payload in the large packet.
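The dependence of the removed data on the identifier can be made concrete with a small sketch. It assumes, purely for illustration, that the identifier is a count value starting at 0 for the first replicated packet; the function name `trim_amounts` is likewise hypothetical.

```python
def trim_amounts(ident: int, payload_len: int, frag_payload: int) -> tuple[int, int]:
    """Return (bytes removed from the start, bytes truncated from the end)."""
    head = min(ident * frag_payload, payload_len)
    kept = min(frag_payload, payload_len - head)
    tail = payload_len - head - kept
    return head, tail


for ident in range(4):
    print(ident, trim_amounts(ident, 8000, 2000))
# 0 (0, 6000)
# 1 (2000, 4000)
# 2 (4000, 2000)
# 3 (6000, 0)
```

The printed pairs match the four cases in the example: the first replica is only truncated, the last is only trimmed at the front, and the middle replicas are trimmed at both ends.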
The identifier provides a way for the hardware in the NIC (e.g., the fragmentation circuitry 125 in
At block 335, the NIC (e.g., the fragmentation circuitry) updates the headers of the packet fragments so that these fragments can be reassembled at their destination. As mentioned above, the embodiments herein are not limited to any particular type of technique or algorithm for updating the headers of packet fragments.
At block 340, the NIC transmits the packet fragments to their common destination. In one embodiment, the packet fragments are separate packets (e.g., PDUs) which can be transmitted independently through the network.
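The control flow of method 300 can be summarized in a compact sketch that works on lengths only. The function name `method_300`, the descriptor dictionaries, and the choice to compare payload lengths rather than full packet lengths are assumptions made for illustration; the block numbers in the comments refer to the blocks described above.

```python
def method_300(payload_len: int, mtu_payload: int) -> list[dict]:
    """Return one descriptor per packet that would be transmitted."""
    if payload_len < 0 or mtu_payload <= 0:
        raise ValueError("invalid sizes")
    # Block 310: is the packet too large?
    if payload_len <= mtu_payload:
        # Block 315: transmit the packet as is, without fragmentation.
        return [{"fragmented": False, "offset": 0, "length": payload_len}]
    # Block 320: replicate based on the MTU size (rounding up).
    count = -(-payload_len // mtu_payload)
    out = []
    for ident in range(count):               # Block 325: assign an identifier
        offset = ident * mtu_payload          # Block 330: shrink using the identifier
        length = min(mtu_payload, payload_len - offset)
        out.append({                          # Block 335: per-fragment header info
            "fragmented": True,
            "offset": offset,
            "length": length,
            "last": ident == count - 1,
        })
    return out                                # Block 340: transmit the fragments


descriptors = method_300(8500, 2000)  # five descriptors, the last 500 bytes long
```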
Assuming that the ingress pipeline 410 determined that IP fragmentation should be performed, a first de-parser 415 forwards the packet to a packet buffer 420. Thus, the parser 405, the ingress pipeline 410, and the de-parser 415 are one implementation of the evaluation circuitry 110 in
The packet buffer 420 replicates the large IP packet. These replicated packets may be exact copies of the large IP packet. Moreover, the number of replicated packets can depend on the size of the large IP packet and the MTU size of the network. In this example, there are four replicated packets so that the 8000 byte payload in the original packet can be reduced to a 2000 byte payload in each of four packet fragments.
The packet buffer 420 is one example of the replicator circuitry 115 described in
A second parser 425 receives the replicated packets from the packet buffer 420. The replicated packets are then processed in an egress pipeline 430 which can reduce the size of the payloads in the replicated packets (so the packets satisfy the MTU size) and update the headers of the packets according to an IP fragmentation algorithm. Moreover, the egress pipeline 430 can perform other functions than the ones described here.
A second de-parser 435 then outputs the IP fragments, which satisfy the MTU size and can be transmitted to the destination on the network. The second parser 425, the egress pipeline 430, and the second de-parser 435 are one implementation of the fragmentation circuitry 125 in
In one embodiment, the IP fragmentation illustrated in
At block 505, the NIC generates an intrinsic header for each replicated packet. The intrinsic header is separate from the IP headers. The intrinsic header can store information that is used by the egress pipeline 430 when processing the replicated packets. The intrinsic headers can be created by the packet buffer 420, the second parser 425, or some other hardware component in the NIC 400. In one embodiment, the intrinsic headers include metadata used by the egress pipeline such as timestamps and basic hardware information.
At block 510, the NIC stores a span ID for each replicated packet in its corresponding intrinsic header. That is, in addition to storing timestamps and other information, the intrinsic headers can store span IDs which uniquely identify the replicated packets from each other.
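A minimal data-structure sketch of an intrinsic header carrying a span ID is shown below. The class name `IntrinsicHeader`, its two fields, and the helper `tag_replicas` are hypothetical; the actual layout of the intrinsic header is not specified here, and a real header may carry additional hardware metadata.

```python
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class IntrinsicHeader:
    """Hypothetical per-packet metadata kept separate from the IP header."""
    span_id: int        # uniquely identifies a replicated packet among its siblings
    timestamp_ns: int   # example of other metadata the egress pipeline might use


def tag_replicas(count: int) -> list[IntrinsicHeader]:
    """Create one intrinsic header per replicated packet, span IDs 0..count-1."""
    now = time.monotonic_ns()
    return [IntrinsicHeader(span_id=i, timestamp_ns=now) for i in range(count)]


headers = tag_replicas(4)  # span IDs 0, 1, 2, 3
```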
At block 515, the egress pipeline identifies a size of a portion at the beginning of the payloads in the replicated packets to remove using the span IDs. This is illustrated in
The egress pipeline can then use the value of the Span ID to determine how much of the data at the beginning of the payload in each packet should be omitted or removed. For the original packet, which has a Span ID 0, none of the payload at the beginning of the packet is removed. For the first replica packet, which has a Span ID 1, 2000 bytes of the payload at the beginning of the packet is removed. For the second replica packet, which has a Span ID 2, 4000 bytes of the payload at the beginning of the packet is removed. For the third replica packet, which has a Span ID 3, 6000 bytes of the payload at the beginning of the packet is removed. This is indicated in
The amount of data removed from the beginning of the payload can be expressed as fragment_payload_size*i where the fragment_payload_size is the desired payload of the fragment to satisfy the MTU size and i is the value of the Span ID.
Returning to the method 500, at block 520 the egress pipeline determines whether to truncate the end of the payloads. This is also illustrated in
The portion of the payload that is not removed from each of the packets is shown by the dotted box. All the data below this box is truncated or removed. Thus, the resulting packet fragments each have 2000 byte payloads which correspond to different 2000 byte chunks of the original payload. For example, the packet fragment generated from the original packet has the first 2000 byte chunk of the original payload, the packet fragment generated from the first replica packet has the second 2000 byte chunk of the original payload, the packet fragment generated from the second replica packet has the third 2000 byte chunk of the original payload, and the packet fragment generated from the third replica packet has the fourth 2000 byte chunk of the original payload.
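The span-ID-driven trimming, including the decision of whether the end of the payload needs truncating, can be sketched as follows. The function name `egress_trim` is an assumption, and the block numbers in the comments map the steps to method 500. An uneven 8500-byte payload is used so that the last replica shows the case where no truncation is needed.

```python
def egress_trim(span_id: int, payload: bytes,
                fragment_payload_size: int) -> tuple[bytes, bool]:
    """Return (fragment payload, whether the end was truncated)."""
    # Block 515: size of the portion to remove at the beginning of the payload.
    head = fragment_payload_size * span_id
    remaining = payload[head:]
    # Block 520: truncate only if more than one fragment's worth remains.
    truncate = len(remaining) > fragment_payload_size
    # Block 525: remove the tail portion when required.
    if truncate:
        remaining = remaining[:fragment_payload_size]
    return remaining, truncate


payload = bytes(i % 251 for i in range(8500))
results = [egress_trim(i, payload, 2000) for i in range(5)]
# Span IDs 0 through 3 are truncated to 2000 bytes; span ID 4 keeps its
# final 500 bytes without truncation.
assert b"".join(frag for frag, _ in results) == payload
```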
At block 525, the egress pipeline removes one or more of the portions of the payloads from the replicated packets to satisfy the MTU size. That is, the egress pipeline can remove portions from the beginning and the end of the payloads as shown in
In the preceding, reference is made to embodiments presented in this disclosure. However, the scope of the present disclosure is not limited to specific described embodiments. Instead, any combination of the described features and elements, whether related to different embodiments or not, is contemplated to implement and practice contemplated embodiments. Furthermore, although embodiments disclosed herein may achieve advantages over other possible solutions or over the prior art, whether or not a particular advantage is achieved by a given embodiment is not limiting of the scope of the present disclosure. Thus, the preceding aspects, features, embodiments and advantages are merely illustrative and are not considered elements or limitations of the appended claims except where explicitly recited in a claim(s).
As will be appreciated by one skilled in the art, the embodiments disclosed herein may be embodied as a system or method. Accordingly, aspects may take the form of an entirely hardware embodiment that may all generally be referred to herein as a “circuit,” “module” or “system.”
Aspects of the present disclosure are described below with reference to flowchart illustrations and/or block diagrams of methods, and apparatus (systems) according to embodiments presented in this disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by one circuit or multiple circuits.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various examples of the present invention. In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts.
While the foregoing is directed to specific examples, other and further examples may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Number | Date | Country | Kind |
---|---|---|---|
202311058856 | Sep 2023 | IN | national |