The invention is based on a priority application EP 04291647.8 which is hereby incorporated by reference.
The present invention relates to a method for establishing a bi-directional peer-to-peer communication link between at least two user agents within a call-based environment. The setup of the communication link is established by means of signaling messages to be exchanged between the at least two user agents before establishing the communication link. The call-based environment comprises a network, the at least two user agents each connected to the network via a network address translation device and at least two call servers each connected to at least one of the user agents.
A network address translation device is used for translating a private identifier of a User Agent into a public identifier or for translating a public identifier into a private identifier. The private identifier is used only within a private domain comprising the User Agent, the call server and the network address translation device and cannot be routed and addressed through public networks. The public identifier is used in the public domain and can be routed and addressed through public as well as private networks. For example, the identifiers comprise an Internet Protocol (IP)-Address and a User Datagram Protocol (UDP)-port. The public domains only have a restricted number of public identifiers. By translating the identifiers one public identifier can be used for a number of private identifiers. Hence, a network address translation device allows a larger number of user agents to be connected to the public network.
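The sharing of one public identifier among several private identifiers can be illustrated by a minimal sketch. It assumes a simplified translation table that allocates public UDP-ports sequentially. The class name `Napt`, the starting port 2000 and the second private address IP111 are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


@dataclass
class Napt:
    """Toy NA(P)T: maps many private endpoints onto one public IP address."""
    public_ip: str
    next_port: int = 2000  # hypothetical start of the public port pool
    out: Dict[Endpoint, Endpoint] = field(default_factory=dict)   # private -> public
    back: Dict[Endpoint, Endpoint] = field(default_factory=dict)  # public -> private

    def outbound(self, private: Endpoint) -> Endpoint:
        """Translate a private source endpoint, creating a binding on first use."""
        if private not in self.out:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[private] = public
            self.back[public] = private
        return self.out[private]

    def inbound(self, public: Endpoint) -> Endpoint:
        """Translate a public destination endpoint back to the private one."""
        return self.back[public]


nat = Napt(public_ip="IP210")
a = nat.outbound(("IP110", 1000))  # first user agent
b = nat.outbound(("IP111", 1000))  # second, hypothetical user agent
```

Both private endpoints share the public address IP210 and are distinguished only by the public UDP-port.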
The present invention can be used for Voice over Internet Protocol (VoIP)- and Next Generation Network (NGN)-Systems using the Session Initiation Protocol (SIP) for the setup of bi-directional peer-to-peer communication links with interposed Firewalls and Network Address (and Port) Translation devices, below referred to as FW/NA(P)T-devices. The communication links can be used for transmitting voice data and/or any kind of multi-media data.
In the state of the art, for the setup of a bi-directional User Datagram Protocol (UDP)-based peer-to-peer communication link, for example, SIP-messages are used. These SIP-messages for their part are transported using UDP-packets. In their so-called Session Description Protocol (SDP) descriptors, these SIP-messages contain information about the calling device, i.e. the device that initiated the setup of the communication link, (User Agent Client, UAC) and about the receiving device, i.e. the recipient of the communication link (User Agent Server, UAS), describing the used Internet Protocol (IP)-addresses and User Datagram Protocol (UDP)-ports for the Real-Time Transport Protocol (RTP)-media flows.
For terminals located in private IP-realms behind a commonly used device with “Firewall” and “Network Address (and Port) Translation” functionality, below referred to as FW/NA(P)T-functionality, the addresses given in the SDP description are normally private or local addresses, i.e. not publicly addressable. These private addresses cannot be tracked and used by corresponding call servers on the other side of the FW/NA(P)T-device.
Because common standard FW/NA(P)T-devices at the border between private and public IP-realms operate on Open Systems Interconnection (OSI) layer 3 and/or 4 only, they are not aware of SIP/SDP-parameters, which are contained in the UDP payload. This means, the FW/NA(P)T-devices only translate addresses and ports in the UDP/IP-header. By leaving the IP-addresses and UDP-ports within the SIP/SDP-messages unchanged, a FW/NA(P)T-device generates the problem that UAC and UAS exchange private IP-addresses/UDP-ports for the RTP-media session, which are not addressable or routable through public IP-networks.
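The effect described above can be illustrated by a minimal sketch. It assumes a simplified packet model in which the SIP/SDP content is an opaque string. The names `UdpPacket` and `napt_outbound`, the signaling source port 5060 and the translated port 6000 are hypothetical, and the symbolic addresses IP110, IP210 and IP220 stand in for real IP-addresses.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UdpPacket:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: str  # SIP message with SDP body, opaque to the NA(P)T


def napt_outbound(pkt: UdpPacket, public_ip: str, public_port: int) -> UdpPacket:
    """Rewrite only the layer 3/4 source fields; the payload passes through."""
    return replace(pkt, src_ip=public_ip, src_port=public_port)


# SDP body announcing the private RTP address and port of the sender.
sdp = "c=IN IP4 IP110\r\nm=audio 1000 RTP/AVP 0\r\n"
sent = UdpPacket("IP110", 5060, "IP220", 5060, sdp)
seen = napt_outbound(sent, "IP210", 6000)
```

After translation the header carries the public source, while the SDP body still announces the private address IP110, which the remote side cannot reach.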
To solve this problem, a terminal in a private network behind a FW/NA(P)T-device needs to learn how the local or private IP-address and the UDP-port it wants to use for an RTP-connection are mapped/bound by the FW/NA(P)T-device to a public IP-address and/or UDP-port. For the setup of a bi-directional peer-to-peer communication, this public IP-address and UDP-port must then be used in the SIP/SDP-message.
There are some approaches known in the art for solving that problem. One possible solution is referred to as Traversal Using Relay Network Address Translation (TURN). Another possible solution is referred to as Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (STUN). Both TURN and STUN have been presented by the Internet Engineering Task Force (IETF). However, these solutions require the installation of additional hardware/servers in the public IP-domain. Additional special, standardized, and parameterized intelligence is required in the user agents (e.g. SIP-phones) for exploration of the NA(P)T-binding information of the IP/UDP. The solution based on TURN is hardly scalable because all signaling and media traffic has to pass through one single server. Furthermore, the known solutions can cover only part of the vast NA(P)T-functionality, which, for example, can be “Full Cone NAT”, “Restricted Cone NAT”, “Port Restricted Cone NAT”, “Symmetric NAT”, etc.
Moreover, other solutions are suggested in the art for controlling the FW/NA(P)T-devices by an application layer entity using control interfaces and protocols like MEGACO, MIDCOM, FCP, UPnP, etc. However, these solutions require on the one hand an upgrade or exchange of a large number of already installed FW/NA(P)T-devices, and on the other hand the installation of centralized or decentralized control entities.
Therefore, it is an object of the present invention to provide SIP-awareness for FW/NA(P)T-devices disposed between a private domain and a public domain of a call-based environment.
This object is solved by a method of the above-mentioned kind, characterized in that before the communication link between the user agents is established, explore messages are exchanged between the call servers via the FW/NA(P)T-devices. The user agents are connected to the public network via the FW/NA(P)T-devices and also directly connected to their respective call servers. Translation information of the FW/NA(P)T-devices is extracted from the explore messages. The content of the signaling messages is modified according to the translation information. Finally, the communication link is established by means of the modified signaling messages.
According to the present invention, the intelligence for acquiring the NA(P)T-binding information is located in the call servers. This has the advantage that user agents (e.g. IP-phone, SIP-phone, etc.) and the FW/NA(P)T-devices of the call-based environment can remain unchanged and that additional devices are not required. Only the call servers have to be adapted slightly in order to send explore messages and to modify signaling messages.
Advantageous embodiments of the invention can be taken from the dependent claims.
The present invention can be used for a Network Address (and Port) Translation (NA(P)T) traversal in a Session Initiation Protocol (SIP) environment. The invention provides a method to explore NA(P)T-binding information of common FW/NA(P)T-devices by extending the functionality of the call servers (e.g. SIP proxy servers), which are involved in the call setup anyway. Exploration of all NA(P)T-bindings along the media path is done by one, two or more involved SIP proxy servers, which are located in the various IP realms (private UAC realm, private UAS realm, private NAP realm, public realm, etc.).
Taking a short break during a session initiation or a call setup with SIP/SDP-messages, the involved SIP proxy servers start the exploration by exchanging special UDP messages (which could be in the SIP-format), pretending to be the user agent by faking the IP source address (IP-SA) and UDP source port (UDP-SP). SA and SP are the private IP-addresses and UDP-ports of the user agents for which the NA(P)T-bindings have to be explored. Faking the UDP-packet's source address and port is no legal problem due to the fact that it takes place within a private IP-domain.
The SIP proxy servers also have knowledge about their public IP addresses (by the Domain Name Service, DNS, where they have to register anyway), and about the existence of NA(P)T-functionality within the media path and the necessity of exploration of NA(P)T-bindings. Of course, the FW/NA(P)T-device must be enabled to forward UDP-messages sent from outside (inbound) at an assigned default SIP-signaling port (e.g. port 5060) to the SIP proxy server in a private IP-realm.
After exploration of the binding information for each FW/NA(P)T-device along the media path, the SIP proxy server finishes the SIP call setup procedure. All SIP/SDP-parameters are replaced with the correct IP-addresses/UDP-ports as seen from the corresponding User Agent on the other side of the FW/NA(P)T-device(s).
The present invention provides a method to explore NA(P)T-transformation information (bindings) of common Standard FW/NA(P)T-devices. Operating at the border between private and public IP realms on OSI layer 3 and/or 4 only, the Standard FW/NA(P)T-devices are not aware of SIP/SDP-parameters, which are contained in the UDP payload. Therefore, a call server in a private network behind the FW/NA(P)T-device obtains the information on how the local IP-address and UDP-port that the user agent wants to use for an RTP-connection are mapped/bound by the FW/NA(P)T-device to a public IP-address and UDP-port, in order to use this public IP-address and UDP-port in the SDP-descriptor part of the SIP messaging for setup of a bi-directional peer-to-peer communication.
Exploration of NA(P)T-bindings is done by using SIP proxy servers in private domains. These SIP proxy servers have to be able to create UDP-messages, which could be in the SIP-format, with faked IP source address and UDP source port (SA/SP) between each other. These UDP-messages are transported by the SIP-proxy servers via the FW/NA(P)T-device, where they are modified. This means, SA and SP of the created messages have to be the local or private IP addresses and UDP ports of that user agent, for which the NA(P)T-bindings have to be explored. The SIP proxy servers have knowledge about their public IP addresses, which they can receive, e.g. from the public DNS-server, where they have to be registered anyway. With this information the SIP proxy servers also have knowledge about the existence of FW/NA(P)T-functionality within the media path and the necessity of exploration of NA(P)T-bindings.
Of course, the FW/NA(P)T-device must be able to forward UDP-messages sent from outside (inbound) at an assigned default SIP signaling port (e.g. port 5060) to the SIP proxy servers in private IP realms.
The method to explore NA(P)T-bindings according to the present invention also works for cascaded FW/NA(P)T-devices.
Further advantages and preferred embodiments of the invention are shown in the drawings and are explained in detail hereinafter making reference to the drawings. Of course, the present invention is not restricted to the preferred embodiments shown in the drawings.
In the figures, a first private domain has the reference number 1, a second private domain has the reference number 2, a third private domain (see
Referring to
The first call server PS1 removes the SDP descriptor, stores it and forwards the first signaling message “INVITE” without the SDP descriptor via a first Network Address (and Port) Translation device (NA(P)T1), the network, a second Network Address (and Port) Translation device (NA(P)T2), a second call or proxy server PS2 to a second user agent UA2 (SIP2, SIP3, SIP4, SIP5 in the
The second user agent UA2 receives the first signaling message “INVITE” (SIP5) and sends a second signaling message “200OK” comprising a private or local identification or SDP descriptor of the second user agent UA2. The second signaling message “200OK” is a response to the first message “INVITE”. The SDP descriptor contains the private identification of the second user agent, i.e. a second IP address (IP310) and a second UDP port (UDP9000). This is where the second user agent UA2 is listening or waiting for RTP-traffic (RTP1) from the first user agent UA1. The second signaling message “200OK” is sent from the second user agent UA2 to the second call server PS2 (SIP6 in the
The first call server PS1 intercepts forwarding of the second signaling message “200OK” after receiving the message (SIP9). This is the point where according to a preferred embodiment of the present invention the NA(P)T-bindings are explored by the call servers PS1 and PS2. For exploring the NA(P)T-bindings the call servers PS1 and PS2 exchange so-called explore messages. Below the exploration of the NA(P)T-bindings is described in more detail for two unidirectional Real-Time Transport Protocol (RTP)-media channels (RTP1 and RTP2).
The exploration is accomplished using special signaling (UDP) messages, for example in the SIP-format, which are exchanged between the two call servers (PS1 and PS2). The explore messages contain information concerning the source of the call in terms of IP-source address (IP-SA) and UDP-source port (UDP-SP). Instead of the source information of the call servers (PS1 and PS2), which send the explore messages, the call servers (PS1 and PS2) insert the source information of the first and second user agents (UA1 and UA2) respectively into the explore messages. Further, the explore messages contain information concerning the user agent to be called and the corresponding SIP-proxy, respectively, in terms of IP-destination address (IP-DA) and UDP-destination port (UDP-DP). The source and destination information is contained in the IP-header of the explore messages. The source information (IP-SA, UDP-SP) is also contained in the payload (SDP of the explore message).
A first explore message is sent from the first call server PS1 to the first FW/NA(P)T-device NA(P)T1 (E11 in the
The first FW/NA(P)T-device NA(P)T1 receives the first explore message and translates the IP-SA and the UDP-SP from the private identification into the public identification of the first user agent UA1, which in the example would be IP210 and UDP2000. This is a dynamic binding created with the first passing of the UDP-data packet. The destination address (IP-DA), the destination port (UDP-DP) and the payload remain unaffected by the translation. Hence, the payload of the message still contains the SIP-message with the private identification of the first user agent UA1. Then, the translated message is transmitted to the second FW/NA(P)T-device NA(P)T2 via the network (E12 in the
The second FW/NA(P)T-device NA(P)T2 receives the translated first explore message and translates the IP-DA and the UDP-DP into the private or local identification of the second user agent UA2 and the corresponding SIP-proxy respectively, which in the example would be IP320 and UDP5060. The source address (IP-SA), the source port (UDP-SP) and the payload remain unaffected by the translation. Hence, the payload of the message still contains the SIP-message with the private identification of the first user agent UA1. Then, the message is transmitted to the second call server PS2 (E13 in the
Receiving the message (E13) or the data packet respectively, the second call server PS2 will extract the NA(P)T-binding information for the first FW/NA(P)T-device NA(P)T1 contained therein:
NA(P)T1: IP110/UDP1000⇄IP210/UDP2000
This means that RTP data packets (RTP2) sent to the public IP address/port IP210/UDP2000 at the first FW/NA(P)T-device NA(P)T1 will be forwarded to the first user agent UA1 at IP110/UDP1000.
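The path of the first explore message (E11, E12, E13) and the extraction of the binding at the second call server PS2 can be sketched as follows. This is a minimal model under stated assumptions, not an implementation of the call servers. The names `ExploreMessage`, `build_explore`, `napt1_outbound`, `napt2_inbound` and `extract_binding` are hypothetical, the translations of both FW/NA(P)T-devices are hard-coded to the example values, and no packets are actually sent.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


@dataclass(frozen=True)
class ExploreMessage:
    src: Endpoint      # IP-SA / UDP-SP in the header
    dst: Endpoint      # IP-DA / UDP-DP in the header
    payload: Endpoint  # copy of the private source, not touched by any NA(P)T


def build_explore(ua_private: Endpoint, remote_public: Endpoint) -> ExploreMessage:
    """PS1 inserts the user agent's private endpoint as the (faked) source."""
    return ExploreMessage(src=ua_private, dst=remote_public, payload=ua_private)


def napt1_outbound(msg: ExploreMessage) -> ExploreMessage:
    """NA(P)T1 translates the source only (example binding)."""
    return replace(msg, src=("IP210", 2000))


def napt2_inbound(msg: ExploreMessage) -> ExploreMessage:
    """NA(P)T2 translates the destination only (forwarding to PS2)."""
    return replace(msg, dst=("IP320", 5060))


def extract_binding(msg: ExploreMessage) -> Tuple[Endpoint, Endpoint]:
    """At PS2: the payload holds the private endpoint, the header the public one."""
    return msg.payload, msg.src


e11 = build_explore(("IP110", 1000), ("IP220", 5060))
e12 = napt1_outbound(e11)
e13 = napt2_inbound(e12)
private, public = extract_binding(e13)
```

The pair `private`, `public` corresponds to the binding IP110/UDP1000⇄IP210/UDP2000 given above. The second explore message can be modelled symmetrically.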
A second explore message is sent from the second call server PS2 to the second FW/NA(P)T-device NA(P)T2 (E14 in the
The second FW/NA(P)T-device NA(P)T2 receives the second explore message and translates the IP-SA and the UDP-SP from the private identification into the public identification of the second user agent UA2, which in the example would be IP220 and UDP8000. This is a dynamic binding created with the first passing of the UDP-data packet. The destination address (IP-DA), the destination port (UDP-DP) and the payload remain unaffected by the translation. Hence, the payload of the message still contains the SIP-message with the private identification of the second user agent UA2. Then, the translated message is transmitted to the first FW/NA(P)T-device NA(P)T1 via the network (E15 in the
The first FW/NA(P)T-device NA(P)T1 receives the translated second explore message and translates the IP-DA and the UDP-DP into the private or local identification of the first user agent UA1 and the corresponding SIP-proxy respectively, which in the example would be IP120 and UDP5060. The source address (IP-SA), the source port (UDP-SP) and the payload remain unaffected by the translation. Hence, the payload of the message still contains the SIP-message with the private identification of the second user agent UA2. Then, the message is transmitted to the first call server PS1 (E16 in the
Receiving the message (E16) or the data packet respectively, the first call server PS1 will extract the NA(P)T-binding information for the second FW/NA(P)T-device NA(P)T2 contained therein:
NA(P)T2: IP310/UDP9000⇄IP220/UDP8000
This means that RTP data packets (RTP1) sent to the public IP address/port IP220/UDP8000 at the second FW/NA(P)T-device NA(P)T2 will be forwarded to the second user agent UA2 at IP310/UDP9000.
Accordingly, after exchange of the explore messages, the first call server PS1 has the binding-information of the second FW/NA(P)T-device NA(P)T2 and the second call server PS2 has the binding-information of the first FW/NA(P)T-device NA(P)T1. This information can be used by the call servers PS1 and PS2 to replace the private or local information of the user agents UA1 and UA2 contained in the signaling messages with the corresponding public information. By doing so, the signaling messages, which now contain the public information of the transmitting user agent UA1 and UA2, can be understood and properly interpreted by the receiving call server PS2 and PS1.
Continuing with the first call server PS1 intercepting the forwarding of the second signaling message “200OK” after receiving the message (SIP9), the first call server PS1 replaces the private or local information of the second user agent UA2, i.e. the private SDP, (IP310/UDP9000) in the received signaling message “200OK” (SIP9) with the public information, i.e. the public SDP, for the second user agent UA2 (IP220/UDP8000) and sends the message with the public SDP to the first user agent UA1 (SIP10 in the
The first user agent UA1 sends a further signaling message “ACK” (acknowledge for the second signaling message “200OK”) via the first and second call server PS1 and PS2 to the second user agent UA2.
The first call server PS1 intercepts the further signaling message “ACK” and inserts the private SDP descriptor (IP110/UDP1000) of the first user agent UA1, which, as described above, was previously stored after having received the first signaling message “INVITE” from the first user agent UA1 (SIP1 in the
The second call server PS2 intercepts the further signaling message “ACK” and replaces the private or local information (IP110/UDP1000) of the first user agent UA1 contained in the payload of the further signaling message “ACK” with the explored public information, i.e. the public SDP descriptor (IP210/UDP2000), for the first user agent UA1. The modified further signaling message “ACK” with the public SDP of the first user agent UA1 is forwarded to the second user agent UA2, which can properly interpret the modified signaling message.
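The replacement of the private SDP information by the explored public information can be sketched as follows. This is a minimal sketch assuming that the SDP descriptor carries the address in a `c=` line and the port in an `m=` line. The function name `rewrite_sdp` is hypothetical, and the symbolic addresses IP310 and IP220 stand in for real IP-addresses.

```python
from typing import Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


def rewrite_sdp(sdp: str, private: Endpoint, public: Endpoint) -> str:
    """Replace the private connection address and media port with the public ones."""
    out = []
    for line in sdp.splitlines():
        parts = line.split()
        if line.startswith("c=") and parts[-1] == private[0]:
            # c=<nettype> <addrtype> <address>: the address is the last field
            parts[-1] = public[0]
            line = " ".join(parts)
        elif line.startswith("m=") and len(parts) >= 2 and parts[1] == str(private[1]):
            # m=<media> <port> <proto> <fmt>: the port is the second field
            parts[1] = str(public[1])
            line = " ".join(parts)
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Private SDP of UA2 as received by PS1 in the "200OK" (SIP9).
private_sdp = "v=0\r\nc=IN IP4 IP310\r\nm=audio 9000 RTP/AVP 0\r\n"
public_sdp = rewrite_sdp(private_sdp, ("IP310", 9000), ("IP220", 8000))
```

The same function applies at the second call server PS2 when it replaces IP110/UDP1000 with IP210/UDP2000 in the “ACK”.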
Now the first and second user agents UA1 and UA2 can start Real-Time Transport Protocol (RTP)-traffic. The data transmission connection which was built up in the above-described way, can be used for various applications, for example multi-media-data-exchange- and Voice-over-IP-applications (VoIP). In the case of VoIP-applications, the data transmission connection could be built up by means of SIP signaling messages. The user agents could be SIP-phones.
If necessary, explored Real-Time Transport Control Protocol (RTCP)-binding information can be inserted at the call servers PS1 and PS2 in the same way as is done with the RTP-binding information. The method to explore FW/NA(P)T-device (NA(P)T)-bindings as described above also works very well with cascaded NA(P)T-devices, as shown in
As mentioned before, each IP-domain with a NA(P)T-device needs its own call server, so each NA(P)T-device knows about the “public” IP-addresses behind the NA(P)T-device.
This means, the first call server PS1 registers at the second call server PS2 and gets the “public” IP-address IP210, the second call server PS2 registers at the Domain Name Service (DNS) and receives the public IP-address IP410. The first and second call server PS1 and PS2 can use these IP-addresses (IP210 and IP410) for the exploration method described above.
The NA(P)T can also incorporate a firewall functionality. In that case, the NA(P)T would be referred to as a Firewall Network Address (and Port) Translation device (FW/NA(P)T). Of course, when building up data transmission connections between two user agents disposed within the same private domain, the signaling messages would probably not be transmitted across the FW/NA(P)T-device and the network, but would rather remain within the realm of the private domain. It would not be necessary to make use of the present invention to initiate a data transmission connection between two user agents of the same private domain, because SIP-signaling would work well without, too. However, the present invention would also work very well if both user agents were disposed within the same private domain and the signaling messages were sent across the FW/NA(P)T-device and the network. In that case, the first and second private domain described above would be the same private domain and the first and second call servers would be the same device. Also, the first and second FW/NA(P)T-devices would be the same device. Both user agents UA1 and UA2 would be connected to the same call server PS and to the same NA(P)T-device.
The present invention suggests an easy way to explore NA(P)T-binding information without the need for additional equipment (servers) or the requirement to upgrade or exchange a large number of already installed NA(P)T-devices, as other approaches require. The main advantages are the following:
The payload of OSI-layer 5 has the reference number 109 and comprises, for example, information concerning the local or private IP-address and UDP-port of the calling user agent. The header 105 of OSI-layer 5 comprises, for example, information concerning the calling user agent (e.g. ‘from’) and concerning the user agent to be called (e.g. ‘to’) or media-session-information.
The payload of OSI-layer 4 has the reference number 108 and comprises the header 105 and the payload 109 of the OSI-layer 5. The header 104 of OSI-layer 4 comprises, for example, UDP-ports or other information concerning the calling user agent and concerning the user agent to be called. The payload of OSI-layer 3 has the reference number 107 and comprises the header 104 and the payload 108 of the OSI-layer 4. The header 103 of OSI-layer 3 comprises, for example, IP-addresses or other information concerning the calling user agent and concerning the user agent to be called. The payload of OSI-layer 2 has the reference number 106 and comprises the header 103 and the payload 107 of the OSI-layer 3. The header 102 of OSI-layer 2 comprises, for example, Media Access Control (MAC)-addresses or other information concerning the calling user agent and concerning the user agent to be called.
Standard NA(P)T-devices only translate the source and destination addresses and ports of the layer 3 and layer 4 part of the signaling message 100. The destination addresses and ports of the layer 5 part of the signaling message 100 remain unaffected by the translation process accomplished by the NA(P)T-devices. This has the disadvantage that, for example, the build-up of a communication link by means of layer 5 (e.g. SIP) signaling messages will not work if the communication link runs across standard NA(P)T-devices. The present invention suggests an easy, convenient and low-cost way of providing layer 5 (e.g. SIP) awareness for standard NA(P)T-devices. This enables build-up of a communication link running across standard NA(P)T-devices by means of layer 5 (e.g. SIP) signaling messages.
The exploration messages for exploring the bindings of the NA(P)T-devices for two unidirectional RTP media channels (RTP1 and RTP2), for example UDP (SIP)-messages, are described by way of example in more detail below:
E11: Header: IP-SA/UDP-SP: IP110/UDP1000 IP-DA/UDP-DP: IP220/UDP5060
E12: Header: IP-SA/UDP-SP: IP210/UDP2000 IP-DA/UDP-DP: IP220/UDP5060
E13: Header: IP-SA/UDP-SP: IP210/UDP2000 IP-DA/UDP-DP: IP320/UDP5060
E14: Header: IP-SA/UDP-SP: IP310/UDP9000 IP-DA/UDP-DP: IP210/UDP5060
E15: Header: IP-SA/UDP-SP: IP220/UDP8000 IP-DA/UDP-DP: IP210/UDP5060
E16: Header: IP-SA/UDP-SP: IP220/UDP8000 IP-DA/UDP-DP: IP120/UDP5060
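The header values listed above can be checked against the bindings stated earlier by comparing consecutive hops. The following minimal sketch copies the six headers into a table and reports which header field each FW/NA(P)T-device changed. The names `HOPS`, `changed_field` and `bindings` are hypothetical.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]          # (IP address, UDP port)
Header = Tuple[Endpoint, Endpoint]  # (source, destination)

# Header values of the explore messages as listed above.
HOPS: Dict[str, Header] = {
    "E11": (("IP110", 1000), ("IP220", 5060)),
    "E12": (("IP210", 2000), ("IP220", 5060)),
    "E13": (("IP210", 2000), ("IP320", 5060)),
    "E14": (("IP310", 9000), ("IP210", 5060)),
    "E15": (("IP220", 8000), ("IP210", 5060)),
    "E16": (("IP220", 8000), ("IP120", 5060)),
}


def changed_field(before: str, after: str) -> str:
    """Name the header field that differs between two consecutive hops."""
    (src_a, dst_a), (src_b, dst_b) = HOPS[before], HOPS[after]
    if src_a != src_b and dst_a == dst_b:
        return "source"
    if dst_a != dst_b and src_a == src_b:
        return "destination"
    return "none" if (src_a, dst_a) == (src_b, dst_b) else "both"


# A binding is the source before and after the sender-side NA(P)T.
bindings = {
    "NA(P)T1": (HOPS["E11"][0], HOPS["E12"][0]),
    "NA(P)T2": (HOPS["E14"][0], HOPS["E15"][0]),
}
```

In each direction the sender-side device changes only the source and the receiver-side device changes only the destination, which matches the translation steps described for E11 to E16.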
Number | Date | Country | Kind |
---|---|---|---|
04291647.8 | Jun 2004 | EP | regional |