The present invention generally relates to communication networks and, more particularly, to handling large quantities of network connections at a server.
Over time the number of products and services provided to users of telecommunication products has grown significantly. Technology advanced and wireless phones of varying capabilities were introduced which had access to various services provided by network operators, e.g., data services. More recently there are numerous devices, e.g., so called “smart” phones and tablets, which can access communication networks in which the operators of the networks, and other parties, provide many different types of services, applications, etc. This has resulted in an increased amount of network traffic which in turn caused an increasing demand for high performing servers.
Existing operating systems consume a certain amount of random access memory (RAM) per open transmission control protocol (TCP) socket or TCP connection, e.g., to maintain read and write buffers for the socket. This places a hard limit on the number of TCP connections that a server can handle. Since a large portion of the TCP connections are open but are not transferring data all of the time, the RAM available in a server is consumed inefficiently.
Existing operating systems thus have problems handling large numbers of parallel TCP connections because the amount of RAM is limited. It is common for servers to handle large amounts of traffic of different kinds, including long-lived connections with relatively sparse traffic exchanges that coexist with connections used for bulk transfer. Long-lived connections consume system resources throughout their existence, and a large number of such long-lived connections has a large impact on the available RAM at the server, even though these connections do not consume much in terms of other network resources, e.g., bandwidth.
Production proxy systems which are required to handle two million parallel connections per blade are not uncommon today. This number of connections per blade server places high demands on RAM, with deployments of up to 256 GB of RAM. However, addressing the problem of RAM consumption due to network socket support simply by continuing to add more RAM to newer servers does not scale because of cost.
Virtual memory is the combination of physical memory and swap space on disk. Although swapping is an automatic way of allowing higher memory utilization, the memory used by the kernel itself cannot be swapped. Thus, for the above-described requirements on massively parallel connections, the memory consumption of the sockets in the kernel is a limiting factor which cannot be alleviated by employing virtual memory. Additionally, moving to a user space TCP/Internet Protocol (IP) stack would allow all of the memory to be swapped to disk, but the application itself does not control when the swapping occurs. Thus, even active connections can be swapped to disk, incurring a large delay and reducing throughput, which makes this approach undesirable.
Thus, there is a need to provide methods and devices that overcome the above-described drawbacks associated with handling a large quantity of network connections.
Embodiments allow for handling large amounts of parallel network connections with a limited amount of RAM by saving a socket to a persistent storage based on certain criteria and then releasing that socket from RAM. The socket can be re-activated when new data arrives on its associated network connection.
According to an embodiment, there is a method for handling network connections in a server. The method includes: creating a network socket for a network connection in a first memory; monitoring the network connection for activity; and storing state information associated with the network socket in a second memory when there is no activity on the network connection for a predetermined period of time.
According to an embodiment, there is a server for handling network connections. The server includes: a first memory in which a network socket for a network connection is created; a processor which monitors the network connection for activity; and a second memory in which state information associated with the network socket is stored when there is no activity on the network connection for a predetermined period of time.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate one or more embodiments and, together with the description, explain these embodiments. In the drawings:
The following description of the embodiments refers to the accompanying drawings. The same reference numbers in different drawings identify the same or similar elements. The following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims. The embodiments to be discussed next are not limited to the configurations described below, but may be extended to other arrangements as discussed later.
Reference throughout the specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout the specification do not necessarily all refer to the same embodiment. Further, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
As described in the Background, there are problems associated with current methods of handling a large quantity of network connections. Embodiments allow for handling large numbers of parallel network connections with a limited amount of random access memory (RAM) by saving the network socket to a persistent storage based on certain criteria and then releasing that network socket from RAM. The network socket can be re-activated when new data arrives on its associated network connection.
A server typically creates a network socket when it receives a data segment with a particular flag set. For example, a transmission control protocol (TCP) server creates a TCP socket when it receives a TCP segment with the SYN flag set. By generically using the term “socket” in the description, it is to be understood that embodiments can be applied to TCP sockets, user datagram protocol (UDP) sockets and other types of network sockets and associated features/items, e.g., segment, server, flag, port, connection, etc.
Prior to discussing various embodiments, some terminology is first introduced. De-multiplexing, as used herein, describes the process of associating an IP datagram with a process and/or network socket listening to a specific network port. Serialization, as used herein, refers to a process of determining that a network socket which is established in RAM should be de-established in RAM and have its state information stored in secondary memory. De-serialization refers to the reverse process, i.e., the case where a socket has its state information stored in secondary memory, which state information is used to re-establish that socket in RAM as part of the de-multiplexing process.
One characteristic which can be monitored to determine if a particular network socket should be serialized is the socket's usage over time. According to an embodiment, each network socket can be associated with an inactivity timer which is reset whenever there is activity on its network connection. When the timer reaches a configured timeout value, a serialization process is initiated where state information associated with the network socket is stored in a socket-cache located in a secondary memory or storage, e.g., a persistent or non-volatile memory. A hash is computed from the connection five tuple in order to create a unique identification for the serialized socket. An example of a five tuple for TCP is “192.160.111.100/40111/71.100.122.70/71/6” for a packet arriving from port 40111 of IP address 192.160.111.100 with the packet arriving at port 71 of IP address 71.100.122.70 and using TCP. Similarly, a five tuple can be created for other protocols, e.g., UDP.
A purely illustrative example of a hash using the above-described five tuple can be mapped out as shown in Equation 1.
hash = (ip_source * Z) XOR ip_destination XOR source_port XOR (dst_port bitshifted 16) XOR proto_number     (1)
where Z is an arbitrary prime number, in this case 59. Using the above five tuple of 192.160.111.100/40111/71.100.122.70/71/6 and a Z value of 59, the hash generated is 189580069603, given that the values are used in host byte order. Bitshifting is performed in this example because the IP addresses are 32-bit while the port number is only 16-bit. In this exemplary hash function, using the bitshifting and the arbitrary prime number Z allows a higher likelihood of obtaining a unique hash.
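The worked example for Equation 1 can be checked with a short sketch. The function name `five_tuple_hash` is hypothetical, and two points that the text leaves open are assumed here: "bitshifted 16" is taken to be a left shift by 16 bits, and the product ip_source*Z is not truncated to 32 bits (Python integers do not overflow). Under those assumptions the sketch reproduces the value stated above.

```python
import ipaddress

Z = 59  # the arbitrary prime used in the worked example


def five_tuple_hash(src_ip, src_port, dst_ip, dst_port, proto, z=Z):
    """Evaluate Equation 1 on host-byte-order integers.

    Assumptions (not stated explicitly in the text): the bit shift is a
    left shift, and the intermediate product is not truncated.
    """
    ip_source = int(ipaddress.IPv4Address(src_ip))
    ip_destination = int(ipaddress.IPv4Address(dst_ip))
    return (ip_source * z) ^ ip_destination ^ src_port ^ (dst_port << 16) ^ proto


# Five tuple 192.160.111.100/40111/71.100.122.70/71/6 (protocol 6 is TCP).
h = five_tuple_hash("192.160.111.100", 40111, "71.100.122.70", 71, 6)
print(h)  # 189580069603
```

Because the terms are combined with XOR, two different five tuples can still collide, so an implementation along these lines would likely need to compare the stored five tuple as well as the hash.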
The socket state information stored in the secondary memory is named with the given hash value. State information associated with the network socket and the network connection includes, but is not limited to, a source port, a destination port, connection established information, congestion window, Slow-Start Threshold (SSThresh) value, RTO state, a memory window size, negotiated options such as Selective Acknowledgement (SACK), maximum segment size (MSS), Window scaling, etc., as well as last sent/acked sequence number/acknowledgement number. The network connection hash is also stored in a lookup table in a primary memory, e.g., RAM of a blade server, and the lookup table is available to the IP-routing portion of the network stack. After serialization, all state information associated with the network socket and network connection is freed from the primary memory, thereby returning the RAM used to maintain that socket to the pool of free RAM that is available to the server for other purposes. Examples of storing network socket information when not in use according to various embodiments are described below in more detail with respect to
According to an embodiment,
According to an embodiment,
According to an embodiment,
In
According to an embodiment, as described above, another trigger for storing network socket state information is memory pressure. Memory pressure can be described as an amount of free space remaining in a memory. A threshold y can be determined either as a percentage, e.g., ten percent or ten percent below Linux limits, or as an amount of memory that, when reached, could be the trigger for storing network socket state information. This trigger can be used in conjunction with a network socket inactivity timer or by itself, as shown in the previous embodiments. Additionally, when serialization is triggered due to a detection by the server that a memory pressure threshold has been exceeded, a least recently used network socket can be serialized or the inactivity time threshold can be reduced from ‘x’ to another value ‘z’ which is less than ‘x’ to increase the storage of sockets and free up more RAM.
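The interplay of the inactivity threshold x, the memory pressure threshold y and the reduced threshold z can be illustrated with a minimal sketch. The class and field names are hypothetical, y is assumed to be expressed as a fraction of free memory, and the sketch combines the two alternatives named above (reduced timeout first, least recently used socket as a fallback), which is only one possible arrangement.

```python
from dataclasses import dataclass


@dataclass
class SocketEntry:
    conn_id: str
    last_activity: float  # time of the last activity on the connection, in seconds


@dataclass
class SerializationPolicy:
    x: float  # normal inactivity timeout, in seconds
    z: float  # reduced timeout used under memory pressure (z < x)
    y: float  # free-memory threshold as a fraction, e.g. 0.10 for ten percent

    def under_pressure(self, free_fraction):
        return free_fraction < self.y

    def select(self, sockets, free_fraction, now):
        """Return the sockets that should be serialized at time `now`."""
        limit = self.z if self.under_pressure(free_fraction) else self.x
        idle = [s for s in sockets if now - s.last_activity >= limit]
        if not idle and sockets and self.under_pressure(free_fraction):
            # Assumed fallback: nothing has timed out, so pick the
            # least recently used socket instead.
            idle = [min(sockets, key=lambda s: s.last_activity)]
        return idle


policy = SerializationPolicy(x=60.0, z=20.0, y=0.10)
sockets = [SocketEntry("a", 90.0), SocketEntry("b", 70.0), SocketEntry("c", 30.0)]
relaxed = policy.select(sockets, free_fraction=0.50, now=100.0)   # only "c"
pressed = policy.select(sockets, free_fraction=0.05, now=100.0)   # "b" and "c"
```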
As mentioned previously, serialization, described above with respect to
According to an embodiment,
The de-multiplexing which is generally described above with respect to
Alternatively, step 404 can be performed first by implementing the flow as follows. Firstly, for each packet that enters the system, perform the standard de-multiplexing procedure to search for an active TCP socket associated with the five tuple. If no network socket is found for the specific connection identifier, compute the hash using the connection five-tuple. Then determine if the hash is present in the lookup table of serialized network socket identifiers in the primary memory. If the hash is not present, continue with a standard de-multiplexing procedure. If the hash is present, initiate de-serialization of the saved network socket state associated with the hash. When de-serialization is complete, forward the packet to the activated network socket.
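The alternative flow just described can be sketched as follows. This is an illustrative outline rather than an actual network stack: the function name `demultiplex`, the dictionary-based packet, and the use of Python's built-in `hash` as a stand-in for the five-tuple hash are all assumptions made for the example.

```python
def demultiplex(packet, active, serialized, deserialize, hash_fn=hash):
    """Return the socket that should receive the packet, or None.

    active      -- dict mapping a five tuple to a live socket (primary memory)
    serialized  -- set of hashes of sockets stored in secondary memory
    deserialize -- callable taking a hash and returning the restored socket
    """
    five_tuple = packet["five_tuple"]
    sock = active.get(five_tuple)      # standard de-multiplexing first
    if sock is not None:
        return sock
    key = hash_fn(five_tuple)          # the hash is computed only on a miss
    if key not in serialized:
        return None                    # caller continues the standard procedure
    sock = deserialize(key)            # re-establish the socket in primary memory
    serialized.discard(key)
    active[five_tuple] = sock
    return sock


ft = ("192.160.111.100", 40111, "71.100.122.70", 71, 6)
active = {}
serialized = {hash(ft)}
restored = demultiplex({"five_tuple": ft}, active, serialized,
                       lambda key: {"restored_from": key})
```

Checking the active sockets first means the hash is computed only for packets that miss, which keeps the common case of an already active connection unchanged.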
According to an embodiment, when a socket is de-serialized, the associated hash is removed from the lookup-table and the data stored in the secondary memory is marked as “dirty”. A separate garbage collection process clears the unused data from the secondary memory.
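A minimal sketch of a socket-cache with this life cycle is given below. The class `SocketCache`, the JSON file format, and the use of a ".dirty" file suffix as the dirty mark are hypothetical choices made for illustration; the description does not prescribe a storage layout.

```python
import json
import os
import tempfile


class SocketCache:
    """Stores serialized socket state in files named after the hash."""

    def __init__(self, directory):
        self.directory = directory  # stands in for the secondary memory
        self.lookup = set()         # lookup table of hashes, kept in primary memory

    def _path(self, key):
        return os.path.join(self.directory, f"{key}.json")

    def serialize(self, key, state):
        with open(self._path(key), "w") as f:
            json.dump(state, f)
        self.lookup.add(key)

    def deserialize(self, key):
        with open(self._path(key)) as f:
            state = json.load(f)
        self.lookup.discard(key)    # remove the hash from the lookup table
        # Mark the stored data as dirty instead of deleting it right away.
        os.replace(self._path(key), self._path(key) + ".dirty")
        return state

    def collect_garbage(self):
        """Separate clean-up pass that removes dirty entries; returns the count."""
        removed = 0
        for name in os.listdir(self.directory):
            if name.endswith(".dirty"):
                os.remove(os.path.join(self.directory, name))
                removed += 1
        return removed


with tempfile.TemporaryDirectory() as tmp:
    cache = SocketCache(tmp)
    cache.serialize(189580069603, {"src_port": 40111, "dst_port": 71, "mss": 1460})
    state = cache.deserialize(189580069603)
    cleaned = cache.collect_garbage()
```

Deferring deletion to a separate pass keeps the de-serialization path short, since the packet that triggered it is waiting to be forwarded.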
According to an embodiment there is a method for handling network connections as shown in
Embodiments described above can be implemented in a device, e.g., the blade server, to improve memory usage via network socket handling. An example of such a blade server is shown in
Implementing the various embodiments allows for better utilization of RAM for active network connections, instead of inactive network connections, as well as timely control of when network sockets should be serialized and which sockets to choose for serialization based, for example, on either a least recently used algorithm or decision criteria on specific IP ranges. For example, the decision criteria could be implemented in any or all of steps 116, 202 and/or 303, from
The disclosed embodiments provide methods and devices for handling large amounts of parallel network connections with a limited amount of RAM. It should be understood that this description is not intended to limit the invention. On the contrary, the embodiments are intended to cover alternatives, modifications and equivalents, which are included in the spirit and scope of the invention. Further, in the detailed description of the embodiments, numerous specific details are set forth in order to provide a comprehensive understanding of the claimed invention. However, one skilled in the art would understand that various embodiments may be practiced without such specific details.
As will also be appreciated by one skilled in the art, the embodiments may take the form of an entirely hardware embodiment or an embodiment combining hardware and software aspects. Further, portions of the embodiments, e.g., the predetermined thresholds or rules to determine the thresholds for x and y, may take the form of a computer program product stored on a computer-readable storage medium having computer-readable instructions embodied in the medium. Any suitable computer-readable medium may be utilized, including hard disks, CD-ROMs, digital versatile discs (DVDs), optical storage devices, or magnetic storage devices such as floppy disks or magnetic tape. Other non-limiting examples of computer-readable media include flash-type memories or other known memories.
Although the features and elements of the present embodiments are described in the embodiments in particular combinations, each feature or element can be used alone without the other features and elements of the embodiments or in various combinations with or without other features and elements disclosed herein. The methods or flowcharts provided in the present application may be implemented in a computer program, software or firmware tangibly embodied in a computer-readable storage medium for execution by a specifically programmed computer or processor.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/SE2016/050703 | 7/8/2016 | WO | 00 |