The invention relates to filtering calls in system area networks.
System area networks (SANs) provide network connectivity among nodes in server clusters. Network clients typically utilize Transmission Control Protocol/Internet Protocol (TCP/IP) to communicate with the application nodes. Application node operating systems are responsible for processing TCP/IP packets.
TCP/IP processing demand at the application nodes, however, can slow system operating speeds. To address this, TCP/IP processing functions can be offloaded to remote TCP/IP processing devices. Legacy applications may use remote procedure call (RPC) technology using non-standard protocols to off-load TCP/IP processing.
The computer system 10 of
The network nodes 16a . . . 16k are platforms that can provide an interface between the network clients 12 and the SAN 14. The network nodes 16a . . . 16k may be configured to perform load balancing across multiple proxy nodes 18a . . . 18k. The proxy nodes 18a . . . 18k are platforms that can provide various network services including network firewall functions, cache functions, network security functions, and load balancing logic. The proxy nodes 18a . . . 18k may also be configured to perform TCP/IP processing on behalf of the application nodes 20a, 20b, 20c . . . 20k. The application nodes 20a, 20b, 20c . . . 20k are platforms that function as hosts to various applications, such as a web service, mail service, or directory service. The application nodes 20a, 20b, 20c . . . 20k may, for example, include a computer or processor configured to accomplish the tasks described herein.
SAN channels 24 interconnect the various nodes. SAN channels 24 may be configured to connect a single network node 16a . . . 16k to multiple proxy nodes 18a . . . 18k, to connect a single proxy node 18a . . . 18k to multiple network nodes 16a . . . 16k and to multiple application nodes 20a, 20b, 20c . . . 20k, and to connect a single application node 20a, 20b, 20c . . . 20k to multiple proxy nodes 18a . . . 18k. The SAN channels 24 connect to ports at each node.
Network clients 12 utilize TCP/IP to communicate with proxy nodes 18a . . . 18k via network nodes 16a . . . 16k. A TCP/IP packet may enter the SAN 14 at a network node 16a and travel through a SAN channel 24 to a proxy node 18a. The proxy node 18a may translate the TCP/IP packet into a message based on a lightweight protocol. The term “lightweight protocol” refers to a protocol that has low operating system resource overhead requirements. Examples of lightweight protocols include Winsock-DP Protocol and Credit Request/Response Protocol. The lightweight protocol message may then travel through another SAN channel 24 to an application node 20a.
Data can also flow in the opposite direction, starting, for example, at the application node 20a as a lightweight protocol message. The lightweight protocol message travels through a SAN channel 24 to the proxy node 18a. The proxy node 18a translates the lightweight protocol data into one or more TCP/IP packets. The TCP/IP packets then travel from the proxy node 18a to a network node 16a through a SAN channel 24. The TCP/IP packets exit the SAN 14 through the network node 16a and are received by the network clients 12.
A stream socket filter 34 transparently intercepts application socket API calls and maps them to lightweight protocol messages communicated to proxy nodes 18a . . . 18k. The stream socket filter 34 provides a technique for applications in application nodes 20a, 20b, 20c . . . 20k to communicate with network clients 12, located external to the SAN 14, via the proxy nodes 18a . . . 18k and the network nodes 16a . . . 16k. The stream socket filter 34 is typically event-driven. A single lightweight protocol message sent or received by the stream socket filter 34 can serve more than one sockets API call. Thus, unnecessary round-trips may be minimized for calls that do not generate any network events. The stream socket filter 34 may reside between an application and a legacy network stack. The stream socket filter 34 may be implemented as a dynamically loadable library module (where supported by the operating system), or as a statically linked library (where recompilation of the source is possible).
The SAN Transport 36, Virtual Interface Provider Library (VIPL) 38, and the Network Interface Card (NIC) 40 are standard components that allow the application node 20a to perform lightweight protocol-based communications.
In legacy applications, sockets are software endpoints used for communications between application nodes 20a, 20b, 20c . . . 20k and network clients 12. Sockets may be opened either actively or passively on an associated file descriptor (socket).
Applications 30 issue requests for actions to take place in the form of calls issued on a file descriptor. As shown in
The stream socket filter 34 determines whether a network event should be generated (block 52) by considering the call issued and the file descriptor. As illustrated in
As shown in
Traditional file descriptors that are assigned by the operating system lie in the range between zero and FD_SETSIZE−1, where FD_SETSIZE−1 typically has the value of 1023. File descriptors above FD_SETSIZE−1, up to 65535, are typically available for use by the proxy node 18a to communicate with the stream socket filter 34.
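The range split described above can be sketched as two simple checks. This is a minimal illustration, not the claimed implementation: the constant values are the typical ones mentioned in the text, the function names are hypothetical, and the sketch assumes the non-OS range begins at FD_SETSIZE.

```python
# Typical values taken from the description; real values are platform-dependent.
FD_SETSIZE = 1024            # assumption: OS-assigned descriptors are 0..1023
MAX_TRANSPORT_FD = 65535     # upper bound mentioned in the description


def in_os_range(fd: int) -> bool:
    """True if fd lies in the range assigned by the operating system."""
    return 0 <= fd < FD_SETSIZE


def in_transport_range(fd: int) -> bool:
    """True if fd lies above the OS range, where proxy-supplied values may live."""
    return FD_SETSIZE <= fd <= MAX_TRANSPORT_FD
```

Because the two ranges do not overlap, a filter built along these lines could tell from the numeric value alone whether a descriptor might have been supplied by a proxy node.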
A socket( ) call in an application typically returns a file descriptor 80 whose value is provided by the application node operating system 26a, 26b, 26c . . . 26k. This file descriptor may be bound to a well-known port for listening on a connection. If this happens, the file descriptor is then categorized as a service file descriptor 98. Service file descriptors 98 may be used to distinguish between different service sessions between an application node 20a and a proxy node 18a. The operating system may also assign file descriptors known as mapped file descriptors 99. Any other file descriptors in the OS-assigned range that are not service file descriptors 98 or mapped file descriptors 99 may typically be used for file input/output or network input/output related functions, usually unrelated to the proxy node 18a or SAN transport 36 functions.
The stream socket filter 34 may use transport file descriptors 94 for both actively and passively opened stream sockets. For passively opened TCP-related sockets, a flow identifier (“flow id”) supplied by a proxy node 18a may be returned by the accept( ) call as the file descriptor to be used by the application 30. The file descriptor returned is actually a transport file descriptor 94 taking on the value of the flow id associated with that particular flow. Some applications (e.g., File Transfer Protocol servers) make a connect( ) call to a network client 12 to actively open a socket on the application node 20a. Since the application node operating system 26a typically generates the file descriptor prior to connection establishment, the file descriptor typically needs to be mapped to a transport file descriptor 94 when the connection is finally established. The application may use the mapped file descriptors 99 assigned by the operating system 26a, whereas the stream socket filter 34 may use the corresponding transport file descriptors 94 for communication.
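For the actively opened case, the relationship between a mapped file descriptor and its transport file descriptor can be pictured as a lookup table kept by the filter. The sketch below is illustrative only; the class name `ActiveOpenMap` and its methods are hypothetical and do not appear in the description.

```python
class ActiveOpenMap:
    """Hypothetical table relating OS-assigned (mapped) fds to transport fds."""

    def __init__(self):
        self._mapped_to_transport = {}

    def on_connection_established(self, mapped_fd, transport_fd):
        # Recorded once the connection is finally established, since the
        # OS-assigned fd exists before the transport fd is known.
        self._mapped_to_transport[mapped_fd] = transport_fd

    def resolve(self, fd):
        # The application keeps using the mapped fd; the filter substitutes
        # the transport fd. Unmapped fds are returned unchanged.
        return self._mapped_to_transport.get(fd, fd)

    def on_close(self, mapped_fd):
        # Forget the mapping; returns the transport fd, or None if unmapped.
        return self._mapped_to_transport.pop(mapped_fd, None)
```

A call such as `resolve(7)` would return the transport file descriptor recorded for descriptor 7, or 7 itself if no mapping exists.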
The stream socket filter 34 recognizes which of the categories (system, service, mapped or transport) a particular file descriptor falls under. Based on that categorization and based on the particular call issued, the stream socket filter 34 determines whether a communication with a proxy node 18a is necessary.
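One way to picture this two-step decision (categorize the descriptor, then consult the call) is sketched below. It is a simplified illustration under stated assumptions: the names `FdCategory`, `FdTable`, `NETWORK_CALLS`, and `needs_proxy` are hypothetical, FD_SETSIZE is taken as 1024, and the per-category call sets are an illustrative guess rather than a list given in the description.

```python
from enum import Enum

FD_SETSIZE = 1024  # assumption: typical value


class FdCategory(Enum):
    SYSTEM = "system"
    SERVICE = "service"
    MAPPED = "mapped"
    TRANSPORT = "transport"


class FdTable:
    """Hypothetical bookkeeping for the OS-assigned descriptors the filter tracks."""

    def __init__(self):
        self.service_fds = set()  # fds bound to a well-known port
        self.mapped_fds = set()   # fds from active opens

    def categorize(self, fd):
        if fd >= FD_SETSIZE:
            return FdCategory.TRANSPORT
        if fd in self.service_fds:
            return FdCategory.SERVICE
        if fd in self.mapped_fds:
            return FdCategory.MAPPED
        return FdCategory.SYSTEM


# Illustrative assumption: calls that involve the proxy node, per category.
NETWORK_CALLS = {
    FdCategory.SERVICE: {"listen", "accept", "close"},
    FdCategory.MAPPED: {"connect", "read", "write", "close"},
    FdCategory.TRANSPORT: {"read", "write", "close"},
}


def needs_proxy(table, call, fd):
    """True if this call on this fd would involve the proxy node in this sketch."""
    return call in NETWORK_CALLS.get(table.categorize(fd), set())
```

System file descriptors have no entry in the table, so every call on them is handled locally in this sketch.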
As shown in
An application 30 on an application node 20a typically starts a service with a socket( ) call. An endpoint is then initialized. If an application 30 issues a bind( ) call followed by a listen( ) call, the stream socket filter 34 notes the service file descriptor 98 and then sends a JOIN_SERVICE message containing the service file descriptor 98 to the proxy node 18a indicating that the application 30 is ready to provide application services. The application 30 then waits for a request from a network client 12 via a select( ) or an accept( ) call. The stream socket filter 34 intercepts the select( ) or accept( ) call and waits for the arrival of a CONNECTION_REQUEST message from the proxy node 18a. The CONNECTION_REQUEST message typically arrives with a flow id assigned by the proxy node 18a, which is then returned to the application 30 in response to the accept( ) call. The application 30 may then use the returned flow id as the transport file descriptor 94 for subsequent reading and writing of data.
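The passive-open sequence just described can be traced with a small simulation. This is a sketch under several assumptions: `FakeProxyChannel` is a hypothetical in-memory stand-in for the SAN transport, messages are modeled as plain tuples (the description does not define a message layout), and `receive` does not block as a real transport would.

```python
from collections import deque


class FakeProxyChannel:
    """Hypothetical in-memory stand-in for the link to a proxy node."""

    def __init__(self):
        self.sent = []           # messages sent toward the proxy node
        self.incoming = deque()  # messages arriving from the proxy node

    def send(self, msg):
        self.sent.append(msg)

    def receive(self):
        # A real transport would block until a message arrives.
        return self.incoming.popleft()


class PassiveOpenFilter:
    """Hypothetical filter handling bind/listen/accept for one application."""

    def __init__(self, channel):
        self.channel = channel
        self.bound = set()
        self.service_fds = set()

    def bind(self, fd, port):
        # Processed locally: no message is generated.
        self.bound.add(fd)

    def listen(self, fd):
        # bind followed by listen: note the service fd and announce readiness.
        if fd in self.bound:
            self.service_fds.add(fd)
            self.channel.send(("JOIN_SERVICE", fd))

    def accept(self, fd):
        if fd not in self.service_fds:
            raise ValueError("accept on a descriptor that is not a service fd")
        msg_type, flow_id = self.channel.receive()
        if msg_type != "CONNECTION_REQUEST":
            raise RuntimeError("unexpected message: " + msg_type)
        # The flow id is handed back to the application as its file descriptor.
        return flow_id
```

In this sketch, the value returned by `accept` is the flow id carried in the CONNECTION_REQUEST message, which the application then treats as an ordinary file descriptor.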
The stream socket filter 34 may map read and write calls from the application 30 onto DATA messages. If an application 30 finishes its data transfer on a particular transport file descriptor 94, it typically invokes a close( ) call, which the stream socket filter 34 will translate to a CLOSE_CONNECTION message that is sent to the proxy node 18a. When the application 30 is ready to shut down its services, it invokes a close( ) call on a service file descriptor 98, which the stream socket filter 34 recognizes, triggering a LEAVE_SERVICE message to be sent to the proxy node 18a, and terminating the services.
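The way the same close( ) call leads to different messages depending on the descriptor can be sketched as a small dispatch function. The function name `outgoing_message` and the tuple message format are hypothetical, and only outgoing messages are modeled; reads, which would consume incoming DATA messages, are omitted.

```python
def outgoing_message(call, fd, service_fds, transport_fds, payload=b""):
    """Return the message a call would trigger in this sketch, or None if local."""
    if call == "write" and fd in transport_fds:
        return ("DATA", fd, payload)
    if call == "close":
        if fd in service_fds:
            # Closing a service fd ends the service session.
            return ("LEAVE_SERVICE", fd)
        if fd in transport_fds:
            # Closing a transport fd ends a single connection.
            return ("CLOSE_CONNECTION", fd)
    # Any other call or descriptor is handled without contacting the proxy node.
    return None
```

For example, closing a descriptor that is in neither set returns `None`, reflecting a close that never involves the proxy node.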
Not all application calls generate communication messages. Calls that do not require generating lightweight protocol messages (e.g., socket( ) and bind( ) calls) may be processed locally.
Systems implementing the techniques described herein are also capable of implementing techniques for error handling, parameter validation, address checking, as well as other standard techniques.
Systems implementing the foregoing techniques may realize faster SAN 14 operating speeds and improved system flexibility. The techniques described herein may alleviate the bottleneck that the operating system's legacy networking protocol stack on servers imposes on inter-process communication (IPC) in a SAN. Operating system related inefficiencies incurred in network protocol processing, such as user/kernel transitions, context switches, interrupt processing, data copies, software multiplexing, and reliability semantics, may be minimized, which may result in an increase in both CPU efficiency and overall network throughput. With TCP/IP processing offloaded to proxy nodes 18a . . . 18k, a lightweight protocol based on SAN Transport 36 may be used in the SAN 14 and may reduce processing overheads on application servers. The stream socket filter 34 may enable legacy applications that use a socket-based networking API to work in a SAN 14 and/or network with non-legacy communication protocols, in conjunction with proxy nodes 18a . . . 18k.
Various features of the system may be implemented in hardware, software or a combination of hardware and software. For example, some aspects of the system can be implemented in computer programs executing on programmable computers. Each program can be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. Furthermore, each such computer program can be stored on a storage medium, such as read-only memory (ROM), readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage medium is read by the computer to perform the functions described above.
Other implementations are within the scope of the following claims.
Number | Date | Country | |
---|---|---|---|
20020099827 A1 | Jul 2002 | US |