Copending U.S. patent application Ser. No. 10/351,030, titled “Reconfigurable Semantic Processor,” filed by Somsubhra Sikdar on Jan. 24, 2003, is incorporated herein by reference.
This invention relates generally to data communications, and more specifically to methods and apparatus for isolating transmission control protocol (TCP) sessions.
In the data communications field, networking devices such as servers typically use packets when communicating over a network. A packet is a finite-length (generally several tens to several thousands of octets) digital transmission unit comprising one or more header fields and a data field. The data field may contain virtually any type of digital data. The header fields convey information (in different formats depending on the type of header and options) related to delivery and interpretation of the packet contents. This information may, e.g., identify the packet's source or destination, identify the protocol to be used to interpret the packet, identify the packet's place in a sequence of packets, provide an error correction checksum, or aid packet flow control. The finite length of a packet can vary based on the type of network that the packet is to be transmitted through and the type of application used to present the data.
Typically, packet headers and their functions are arranged in an orderly fashion according to the open-systems interconnection (OSI) reference model. This model partitions packet communications functions into layers, each layer performing specific functions in a manner that can be largely independent of the functions of the other layers. For instance, a network layer, typically implemented with the well-known Internet Protocol (IP), provides network-wide packet delivery and switching functionality, while a higher-level transport layer can provide mechanisms for end-to-end delivery of packets. As such, each layer can prepend its own header to a packet, and regard all higher-layer headers as merely part of the data to be transmitted.
Transmission Control Protocol (TCP) is a transport-layer protocol used to provide mechanisms for highly reliable end-to-end delivery of packet streams during an established TCP session. Traditionally, the establishment of a TCP session requires a three-way handshake between communicating endpoints. This three-way handshake allows TCP endpoints to exchange socket information uniquely identifying the TCP session to be established, and to exchange initial sequence numbers and window sizes used in packet sequencing, error recovery, and flow control. An example of a typical three-way handshake may include a first TCP endpoint sending a synchronize (SYN) packet to a second TCP endpoint, the second TCP endpoint responding with a synchronize-and-acknowledgment (SYN-ACK) packet, and the first TCP endpoint sending an acknowledgment (ACK) packet in response to the SYN-ACK packet. TCP further requires a similar exchange of termination (FIN) packets and acknowledgments to the FIN packets when closing an existing TCP session. Thus, to use TCP in data exchanges, a TCP endpoint must be able to maintain information regarding the state of each of its TCP sessions, e.g., opening a TCP session, waiting for acknowledgment, exchanging data, or closing a TCP session.
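By way of illustration only, the per-session state that a TCP endpoint must track during the handshake described above may be sketched as a small transition table. The state names, the `TRANSITIONS` table, and the `step` and `run` helpers are hypothetical names chosen for this sketch, and the closing sequence is deliberately collapsed to a single FIN/ACK step rather than the full set of TCP closing states.

```python
from enum import Enum, auto


class State(Enum):
    LISTEN = auto()        # waiting for a SYN
    SYN_RECEIVED = auto()  # SYN seen, SYN-ACK sent, waiting for the final ACK
    ESTABLISHED = auto()   # three-way handshake complete
    CLOSED = auto()        # simplified: FIN seen and acknowledged


# (current state, incoming segment type) -> (next state, reply to send or None)
TRANSITIONS = {
    (State.LISTEN, "SYN"): (State.SYN_RECEIVED, "SYN-ACK"),
    (State.SYN_RECEIVED, "ACK"): (State.ESTABLISHED, None),
    (State.ESTABLISHED, "FIN"): (State.CLOSED, "ACK"),
}


def step(state, segment):
    """Advance one segment; a segment that conflicts with the state is ignored."""
    return TRANSITIONS.get((state, segment), (state, None))


def run(segments, state=State.LISTEN):
    """Feed a sequence of segment types through the table, collecting replies."""
    replies = []
    for segment in segments:
        state, reply = step(state, segment)
        if reply is not None:
            replies.append(reply)
    return state, replies
```

In this sketch, `run(["SYN", "ACK"])` leaves the session in `State.ESTABLISHED` after replying with a single SYN-ACK, while a FIN arriving before any SYN leaves the state unchanged.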
A commonly exploited weakness of TCP stems from this maintenance of state information. For instance, in a SYN flood denial-of-service attack, multiple SYN packets are received by a TCP endpoint, each requesting the establishment of a different TCP session. The initiator of the attack, however, does not have any intention of completing the corresponding three-way handshakes, oftentimes providing a fictitious source port to ensure their failure. Responding to this flood of SYN packets consumes the TCP endpoint's limited processing resources by requiring it to maintain state information for each session being opened while waiting for acknowledgments that will never arrive. Another attack that misallocates processing resources involves receiving packets for a session that conflict with the maintained state information, e.g., sending a SYN packet in an already established session or a FIN packet for a session that has not been established.
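The resource exhaustion described above may be illustrated with a minimal sketch, assuming a fixed-capacity table of half-open sessions. The `HalfOpenTable` class, its capacity, and the `simulate_flood` helper are hypothetical and model only the bookkeeping, not real network traffic.

```python
class HalfOpenTable:
    """Tracks sessions that have sent a SYN but not yet the final ACK."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = set()

    def on_syn(self, source):
        """Reserve state for a new session; return False when the table is full."""
        if len(self.pending) >= self.capacity:
            return False
        self.pending.add(source)
        return True

    def on_ack(self, source):
        """Complete a handshake and release its half-open slot."""
        if source in self.pending:
            self.pending.remove(source)
            return True
        return False


def simulate_flood(capacity, spoofed_count):
    """Return True if a legitimate SYN is still accepted after the flood."""
    table = HalfOpenTable(capacity)
    for i in range(spoofed_count):
        # Spoofed sources never send the final ACK, so their slots are never freed.
        table.on_syn(("spoofed", i))
    return table.on_syn(("legitimate", 0))
```

With a capacity of four, ten spoofed SYN packets fill the table and the legitimate request is refused, whereas two spoofed SYN packets leave room for it.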
Once a TCP session is properly established, TCP endpoints may exchange data in a TCP packet stream. Since packets may be lost or arrive out of order during transmission, TCP provides mechanisms to retransmit lost or late packets and to reorder the packet stream upon arrival, including discarding duplicate packets. TCP endpoints may also be required to perform other exception processing prior to the TCP reordering, such as reassembling lower-layer fragmented packets, e.g., IP fragments, and/or performing cryptography operations, e.g., according to one or more Internet Protocol Security (IPSec) headers. Thus, use of TCP to reliably exchange packet streams comes at a cost of efficiency in TCP endpoint processing and increased vulnerability to TCP-based attacks. Accordingly, a need remains for an improved system and method for communicating over a network using TCP.
The invention may be best understood by reading the disclosure with reference to the drawings, wherein:
Direct network communication using Transmission Control Protocol (TCP) may increase a networking device's vulnerability to TCP-based attacks and require additional processing of packets upon arrival. The addition of a proxy TCP endpoint designed specifically to perform the direct TCP-based network communications shields networking devices from potential attacks and increases their processing efficiency. Embodiments of the present invention will now be described in more detail.
The proxy 200 maintains at least one TCP session over the network 120 and a corresponding local session with the networking device 140. In some embodiments, the local session may be a TCP session established with the networking device 140 through a private network, e.g., a company enterprise network, Internet Service Provider (ISP) network, home network, etc. The proxy 200 functions as a network communications intermediary for networking device 140 by translating data between the local and TCP sessions. For instance, when receiving packetized data from the network 120 in a TCP session, the proxy 200 may sequence and depacketize the data prior to providing it to the networking device 140 in the local session. The depacketization may include reassembling Internet Protocol (IP) fragments, and/or performing cryptography operations, e.g., according to the Internet Protocol Security (IPSec) header(s). This sequencing and processing by proxy 200 allows the networking device 140 to receive a uniform data stream in the local session, ensuring quality-of-service (QOS) for the networking device 140 and control over network bandwidth usage.
Since the proxy 200 is the endpoint for the network communications, not networking device 140, the TCP session has a TCP signature of the proxy 200, thus concealing the identity of the networking device 140 from the network 120. This concealment of the networking device 140 limits its exposure to network-based attacks. The proxy 200 may perform Network Address Translation (NAT) of destination and source IP addresses to help hide the identity of the networking device 140. The proxy 200 may be implemented at any network interface, such as a firewall.
In some embodiments, proxy 200 may provide network communication and processing for multiple networking devices 140. In these embodiments, the management of network communication at a single network interface point may allow proxy 200 to provide additional functionality for increasing the efficiency of the network management and packet processing. For instance, when the proxy 200 discovers network changes, e.g., next hop change, Internet Control Message Protocol (ICMP) fragments, packet loss, etc., in one of the TCP sessions, the changes may be applied to all of the TCP sessions. This becomes especially powerful when combined with the full neighbor implementation of Border Gateway Protocol (BGP) or other link state routing protocol that is aware of the entire topology of network 120. Additionally, since the proxy 200 maintains multiple sessions, the status and statistics of these sessions can be accessed at a single network interface point.
The structure and operation of proxy 200 for some embodiments of the invention will be explained with reference to
The network-interface proxy 210 includes a TCP state machine 212 to establish and manage the TCP sessions over the network 120, including maintaining state information for each TCP session and implementing packet sequencing, error recovery and flow control mechanisms. The TCP state machine 212 sequences and processes packet streams received over the TCP sessions and provides the sequenced payload data to the device-interface proxy 220. Because TCP state machine 212 previously sequenced and processed the payload data, the device-interface proxy 220 is then capable of providing a uniform data stream to networking device 140 in the local session. The TCP state machine 212 further packetizes payload data received from device-interface proxy 220 and transmits it over the corresponding TCP session.
The device-interface proxy 220 may include a TCP state machine 222 to establish and manage local TCP sessions with the networking device 140. TCP state machine 222 operates similarly to TCP state machine 212 with respect to packet streams over the local TCP sessions.
According to a next block 320, the proxy 200 receives a packet stream in the TCP session 122 over the network 120. The proxy 200 manages the TCP session 122 by providing error recovery for lost or late packets and flow rate control by adjusting the size of the TCP window.
According to a next block 330, the proxy 200 translates data from the packet stream to the local session 124. The translation includes sequencing and depacketizing the data, e.g., with the network-interface proxy 210, and providing the data to the networking device 140 in the local session 124. The sequencing may include reordering packets received out of order and discarding duplicated packets, while the depacketization may include any additional processing that may be required, such as reassembly of IP fragmented packets and/or performance of cryptography operations according to IPSec headers. Although the flowchart 300 shows data transfers from the network 120 to the networking device 140, proxy 200 may also provide data in the opposite direction. The proxy 200 provides operations that are not typically provided in firewalls. However, the proxy 200 can also include, in addition to the TCP proxy operations, other conventional firewall operations.
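The sequencing step described above, i.e., reordering packets received out of order and discarding duplicates, may be sketched as follows. This is a minimal illustration and not the proxy's actual implementation: the `sequence_segments` function is a hypothetical name, segments are modeled as (sequence number, payload) pairs, and segments are assumed to align on payload boundaries with no partial overlap.

```python
def sequence_segments(segments, initial_seq):
    """Rebuild an in-order byte stream from segments in arrival order.

    segments: iterable of (seq, payload) pairs, where seq is the sequence
    number of the first payload byte. Returns (stream, next_expected_seq);
    the stream stops at the first gap, where retransmission would be needed.
    """
    buffered = {}
    for seq, payload in segments:
        # Keep the first copy of each segment; later duplicates are discarded.
        buffered.setdefault(seq, payload)

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in buffered:
        payload = buffered.pop(next_seq)
        stream += payload
        next_seq += len(payload)
    return bytes(stream), next_seq
```

For example, segments arriving as `[(2, b"ll"), (0, b"he"), (2, b"ll"), (4, b"o")]` with an initial sequence number of 0 are delivered as `b"hello"`, while a stream missing the segment at sequence number 2 is delivered only up to that gap.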
A PCI-X interface 480 is coupled to the input buffer 430, the output buffer 440, and an external PCI bus 482. The PCI bus 482 can connect to other PCI-capable components, such as disk drives, interfaces for additional network ports, other semantic processors, etc. The PCI-X interface 480 provides data streams or packets to input buffer 430 from PCI bus 482 and transmits data streams or packets over PCI bus 482 from output buffer 440.
Semantic processor 400 includes a direct execution parser (DXP) 450 that controls the processing of packets in the input buffer 430 and a semantic processing unit (SPU) 460 for processing segments of the packets or for performing other operations. The DXP 450 maintains an internal parser stack (not shown) of non-terminal (and possibly also terminal) symbols, based on parsing of the current input frame or packet up to the current input symbol. When the symbol (or symbols) at the top of the parser stack is a terminal symbol, DXP 450 compares data DI at the head of the input stream to the terminal symbol and expects a match in order to continue. When the symbol at the top of the parser stack is a non-terminal (NT) symbol, DXP 450 uses the non-terminal symbol NT and current input data DI to expand the grammar production on the stack. As parsing continues, DXP 450 instructs an SPU 460 to process segments of the input, or perform other operations.
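The stack discipline described above may be illustrated with a minimal software sketch of a table-driven parser. This is not the DXP hardware design: the `parse` function, the one-byte lookahead, and the representation of terminals as integers and non-terminals as strings are assumptions made for this sketch, and SPU dispatch is omitted.

```python
def parse(data, table, start):
    """Parse bytes with an explicit symbol stack.

    table maps (non_terminal, input_byte) -> list of symbols to push.
    Terminals are ints (byte values); non-terminals are strings.
    Returns True if all input is consumed and the stack empties.
    """
    stack = [start]
    pos = 0
    while stack:
        top = stack.pop()
        if isinstance(top, int):
            # Terminal on top of stack: the head of the input must match it.
            if pos >= len(data) or data[pos] != top:
                return False
            pos += 1
        else:
            # Non-terminal on top of stack: expand it using the current input.
            if pos >= len(data):
                return False
            production = table.get((top, data[pos]))
            if production is None:
                return False
            # Push in reverse so the first symbol of the production is on top.
            stack.extend(reversed(production))
    return pos == len(data)


# Hypothetical two-rule grammar: a packet is byte 0x45 followed by byte 0x06.
EXAMPLE_TABLE = {
    ("PACKET", 0x45): [0x45, "PROTOCOL"],
    ("PROTOCOL", 0x06): [0x06],
}
```

With the example table, `parse(bytes([0x45, 0x06]), EXAMPLE_TABLE, "PACKET")` succeeds, while an input whose second byte is 0x11 fails because no production exists for that combination of non-terminal and input.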
Semantic processor 400 uses at least three tables. Code segments for SPU 460 are stored in semantic code table 456. Complex grammatical production rules are stored in a production rule table (PRT) 454. Production rule (PR) codes 453 for retrieving those production rules are stored in a parser table (PT) 452. The PR codes 453 in parser table 452 also allow DXP 450 to detect whether, for a given production rule, a code segment from semantic code table 456 should be loaded and executed by SPU 460.
The production rule (PR) codes 453 in parser table 452 point to production rules in production rule table 454. PR codes 453 are stored, e.g., in a row-column format or a content-addressable format. In a row-column format, the rows of the table are indexed by a non-terminal symbol NT on the top of the internal parser stack, and the columns of the table are indexed by an input data value (or values) DI at the head of the input. In a content-addressable format, a concatenation of the non-terminal symbol NT and the input data value (or values) DI can provide the input to the parser table 452. Preferably, semantic processor 400 implements a content-addressable format, where DXP 450 concatenates the non-terminal symbol NT with 8 bytes of current input data DI to provide the input to the parser table 452. Optionally, parser table 452 concatenates the non-terminal symbol NT and 8 bytes of current input data DI received from DXP 450.
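The content-addressable lookup described above may be sketched in software as a dictionary keyed by the concatenation of NT and 8 bytes of DI. This is an illustrative approximation only: the one-byte NT encoding, the zero padding of short input, the exact-match lookup, and the `make_key` and `lookup_production` names are assumptions of the sketch and are not taken from the disclosure.

```python
DI_WIDTH = 8  # bytes of current input data concatenated with the NT symbol


def make_key(nt, data):
    """Concatenate a one-byte NT code with the first 8 bytes of input data."""
    di = data[:DI_WIDTH].ljust(DI_WIDTH, b"\x00")  # zero-pad short input
    return bytes([nt]) + di


def lookup_production(parser_table, production_rule_table, nt, data):
    """Two-step lookup: key -> PR code, then PR code -> production rule."""
    pr_code = parser_table.get(make_key(nt, data))
    if pr_code is None:
        return None
    return production_rule_table[pr_code]


# Hypothetical table contents for demonstration.
NT_EXAMPLE = 0x01
EXAMPLE_INPUT = bytes(range(8))
EXAMPLE_PARSER_TABLE = {make_key(NT_EXAMPLE, EXAMPLE_INPUT): 7}
EXAMPLE_PRODUCTION_RULES = {7: ["NT_NEXT", 0x45]}
```

Here the parser table returns PR code 7 for the example key, and that code selects the corresponding entry in the production rule table; any other combination of NT and input data returns `None`.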
Input buffer 430 includes a recirculation buffer 432 to buffer data streams requiring additional passes through the DXP 450. DXP 450 parses data streams from recirculation buffer 432 similarly to those received through input port 410 or PCI bus 482.
The semantic processor 400 includes a memory subsystem 470 for storing or augmenting segments of the packets. When prompted by the DXP 450 in response to the parsing of packet headers, the SPU 460 may sequence TCP packets and/or collect and assemble IP fragmented packets within memory subsystem 470. The memory subsystem 470 may also perform cryptography operations on data streams, including encryption, decryption, and authentication, when directed by SPU 460. Once reassembled and/or processed in the memory subsystem 470, the packets or their headers with a specialized NT symbol may be sent to the recirculation buffer 432 for additional parsing by DXP 450.
In certain state-dependent protocols, such as TCP, the reception order of packets gives rise to semantics that may be exploited by this semantic processing architecture. For instance, the reception of a TCP SYN packet indicates to the DXP 450 an attempt to establish a TCP session. If the session has already been established, however, there is no further need to allocate resources to complete the processing of the packet, acknowledge its arrival, or maintain corresponding state information. Thus any TCP packet may be correct syntactically, but out-of-sequence with regard to the state of the TCP session. The semantic processor 400 recognizes these packet-ordering semantics and implements a TCP state machine, such as 212 or 222 in FIG. 3, for managing the required TCP interactions and maintaining the state information for TCP sessions.
In a next decision block 530, the semantic processor 400 determines whether the received TCP packet corresponds to a TCP session maintained by semantic processor 400. The memory subsystem 470 maintains information for each active TCP session with semantic processor 400, including the current state of the session, packet sequencing, and window sizing. The SPU 460, when directed by the DXP 450, performs a lookup within memory subsystem 470 for a maintained TCP session that corresponds to the received TCP packet.
When a TCP session corresponding to the TCP packet is maintained within semantic processor 400, in a next decision block 540, the semantic processor 400 determines whether the TCP packet coincides with the current state of the TCP session. The SPU 460 may retrieve the state of the maintained TCP session, e.g., one or more non-terminal (NT) symbols, for the DXP 450. These NT symbols point to specialized grammatical production rules that correspond to each of the TCP states and control how the DXP 450 parses the TCP packet.
For instance, when the TCP packet is a SYN packet and its corresponding TCP session is already established, the TCP SYN packet does not coincide with the state of the TCP session and thus is discarded (at block 580) without further processing. Alternatively, when the TCP packet is a TCP data packet or a TCP FIN packet in an already established TCP session, the DXP 450 parses the packet according to the state of the TCP session in a next block 550.
Upon completion of parsing by the DXP 450, the SPU 460 may forward the TCP packet to the destination address for a networking device 140, or send the payload to another semantic processor 400 where it is provided to the networking device 140 in a local session 124. The SPU 460 performs any reassembly or cryptography operations, including decryption and/or authentication, before forwarding the packets in the TCP session to the networking device 140. The processed packets are provided to output buffer 440, or to PCI bus 482 via PCI-X interface 480, after the processing operations have been completed by SPU 460.
When, at decision block 530, a TCP session corresponding to the TCP packet is not maintained within semantic processor 400, in a next decision block 560, the semantic processor 400 determines whether the TCP packet is a SYN packet attempting to establish a TCP session with semantic processor 400. The DXP 450 may determine if the TCP packet is a SYN packet by parsing the SYN flag in the TCP header.
When the TCP packet is not a SYN packet, in the next block 580, the semantic processor 400 discards the packet from the input buffer 430. The SPU 460 may discard the packet from the input buffer 430 when directed by DXP 450.
When the TCP packet is a SYN packet, in a next block 570, the semantic processor 400 opens a TCP session according to the TCP SYN packet. The SPU 460, when directed by DXP 450, executes microinstructions from semantic code table 456 that cause the SPU 460 to open a TCP session. The SPU 460 may open the TCP session by sending a TCP ACK message back to the source address identified by the TCP SYN packet and by allocating a context control block within memory subsystem 470 for maintaining information, including the state of the session, and packet sequencing and window sizing information. Execution then returns to block 510, where semantic processor 400 receives subsequent packets at input buffer 430, and the DXP 450 parses the subsequent packets corresponding to the established TCP session.
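The decision flow of blocks 530 through 580 may be summarized in a short software sketch. The following is illustrative only and does not represent the microinstructions executed by SPU 460: the `handle_packet` function, the dictionary standing in for the context control blocks of memory subsystem 470, the string state names, and the session key are hypothetical, and a newly opened session is treated as established immediately for simplicity.

```python
def handle_packet(sessions, key, flags):
    """Classify one TCP packet against the maintained sessions.

    sessions: dict mapping a session key to its state (stand-in for the
    context control blocks). key: any hashable session identifier.
    flags: set of TCP flag names present in the packet, e.g. {"SYN"}.
    Returns the action taken: "parse", "open", or "discard".
    """
    if key in sessions:                       # decision block 530
        state = sessions[key]
        if "SYN" in flags and state == "ESTABLISHED":
            return "discard"                  # block 540 fails -> block 580
        return "parse"                        # block 550
    if "SYN" in flags:                        # decision block 560
        sessions[key] = "ESTABLISHED"         # block 570: allocate session state
        return "open"
    return "discard"                          # block 580: no session, not a SYN


def classify_all(packets):
    """Run a list of (key, flags) packets through a fresh session table."""
    sessions = {}
    return [handle_packet(sessions, key, flags) for key, flags in packets]
```

For example, a SYN for a new session is opened, a repeated SYN for the same session is discarded without further processing, a subsequent data or FIN packet for that session is parsed, and a FIN for an unknown session is discarded.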
One of ordinary skill in the art will recognize that the concepts taught herein can be tailored to a particular application in many other advantageous ways. In particular, those skilled in the art will recognize that the illustrated embodiments are but one of many alternative implementations that will become apparent upon reading this disclosure.
The preceding embodiments are exemplary. Although the specification may refer to “an”, “one”, “another”, or “some” embodiment(s) in several locations, this does not necessarily mean that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment.