The present invention relates to NAT (Network Address Translator) traversal under TCP, and more particularly to NAT traversal for the Real Time Streaming Protocol (RTSP), in order to solve the problem that multimedia audio/video messages cannot be exchanged when the RTSP media server and the client are both under a NAT firewall.
Nowadays the IP camera is one of the popular “Internet of Things” devices. Most IP cameras use the Real Time Streaming Protocol (RTSP) because RTSP suits one-way audio/video communication and streaming. In a standard RTSP Internet environment, TCP (Transmission Control Protocol) is the major protocol for transmitting multimedia data. However, more and more people set up a NAT (Network Address Translator, commonly known as an IP router), so that the IP camera and the client are both under a NAT. As a result, the IP camera and the client cannot exchange RTSP messages, and video/audio RTP packets cannot be transmitted through TCP directly.
A basic procedure of a conventional RTSP for browser application is shown in
Conventional RTSP requires that the media server 2167 have a real IP in order to execute the aforementioned basic procedure. If the media server 2167 is a small mobile media server such as an IP camera, the IP camera may be under an IP router (NAT), so the media server will have a virtual IP. If the client is also under an IP router (NAT), RTSP communication between the two sides will fail because the real IP and port number of each side are unknown to the other; therefore peer to peer transmission of media packets cannot be achieved.
The present invention provides an NAT traversal method under TCP for RTSP. The RTSP includes a Login Session, a CallSetup Session, a Media Session and a Cancel Session, and the system includes a first NAT, a second NAT and an RTSP proxy server; an IE browser (client) is under the first NAT, and an IP camera (media server) is under the second NAT. The method comprises the steps below:
Many users of Internet multimedia want to control the playing of the media, especially those who like to use a remote controller. They like to pause, play forward or backward, fast forward, fast backward, etc., just as a user does when watching a movie on a DVD player or listening to music on a CD player. In order to let the user control playing, the Real Time Streaming Protocol (RTSP) is used to exchange playback control messages between the media playing program and the server. There are two kinds of packets in RTSP: Request and Response. A Request is an RTSP message from the client to the server that expresses the purpose of the client, while a Response is an RTSP message from the server to the client that answers the request of the client.
Six RTSP Requests are used here: SETUP, PLAY, PAUSE, TEARDOWN, OPTIONS and DESCRIBE, as shown in Table 1.
RTSP Response messages are messages sent by the server in response to the requests of the client, as shown in Table 2.
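To make the Request/Response distinction concrete, the following minimal sketch builds an RTSP request and parses the status line of a response. It assumes the usual RTSP/1.0 text framing (a request line or status line followed by headers, with CRLF line endings); the URL, the CSeq value and the helper names are illustrative assumptions and are not taken from the invention.

```python
def build_request(method, url, cseq, headers=None):
    """Serialize an RTSP/1.0 request as text with CRLF line endings."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response):
    """Return (status_code, reason) from the first line of a response."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    if not version.startswith("RTSP/"):
        raise ValueError("not an RTSP response")
    return int(code), reason


# Hypothetical URL, used only for illustration.
request = build_request("SETUP", "rtsp://example.com/stream", 1)
status = parse_status_line("RTSP/1.0 200 OK\r\nCSeq: 1\r\n\r\n")
```

Here `request` is the text a client would send, and `status` is the code and reason phrase extracted from the server's answer.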
Referring to
The CallSetup Session is the first session. The IE browser (client) 2178 sends a SETUP message to the IP camera (media server) 2167, and a 200 OK message is returned to the client 2178. When the client 2178 is going to play the media, the client 2178 sends PLAY to the IP camera 2167, and a 200 OK message is returned to the client 2178.
Thereafter, the client 2178 and the IP camera 2167 enter the Media Session, in which the IP camera 2167 sends audio/video media directly to the IE browser of the client 2178.
When the client 2178 is going to stop the audio/video media from the IP camera 2167, the client 2178 sends TEARDOWN to the IP camera 2167, and a 200 OK message is then returned to the client 2178 to enter the Cancel Session.
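The order of sessions described above can be sketched as a small transition table. This is only an illustration of the sequence SETUP, PLAY, TEARDOWN as the text describes it; the table layout and the function name are assumptions, and each request is taken to be answered with 200 OK.

```python
# (current session, request answered with 200 OK) -> next session
TRANSITIONS = {
    ("CallSetup", "SETUP"): "CallSetup",   # still setting up the call
    ("CallSetup", "PLAY"): "Media",        # media starts flowing
    ("Media", "TEARDOWN"): "Cancel",       # client stops the media
}


def run_session(requests, start="CallSetup"):
    """Follow a list of requests and return the session that is reached."""
    session = start
    for request in requests:
        key = (session, request)
        if key not in TRANSITIONS:
            raise ValueError(f"{request} is not expected in {session} Session")
        session = TRANSITIONS[key]
    return session
```

For example, `run_session(["SETUP", "PLAY"])` ends in the Media Session, and appending `"TEARDOWN"` ends in the Cancel Session.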
Referring to
The client 2178 sends a “SYN” message to the server 2167 to request a connection. When the server 2167 is ready, it returns a “SYNACK” message to inform the client 2178 that it is “ready for connecting”. Thereafter, the client 2178 sends an “ACK” message to inform the server 2167 to “start transmission”. The “Three-way Handshaking” is thereby completed and a TCP channel is set up.
Since TCP connection setup is a public standard procedure, the TCP API does not allow any designer to revise the “Three-way Handshaking”. All actions of the “Three-way Handshaking” are carried out by the operating system.
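The point that the handshake is hidden behind the API can be seen in a minimal loopback sketch. The application only calls connect and accept; the SYN, SYNACK and ACK packets are exchanged by the operating system. The loopback address and the function name are assumptions made so that the example runs on a single machine.

```python
import socket


def open_tcp_channel():
    """Open a loopback TCP channel; the OS performs the handshake."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    # create_connection() returns once the kernel has finished the handshake.
    client = socket.create_connection(listener.getsockname())
    server_side, peer = listener.accept()
    listener.close()
    return client, server_side, peer


client, server_side, peer = open_tcp_channel()
client.sendall(b"hello")
received = server_side.recv(5)
client.close()
server_side.close()
```

Nothing in this code builds or inspects a SYN packet, which is why the traversal described later has to work through ordinary connect and listen calls.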
Referring to
Referring to
Referring to
When the client (IE browser) 2178 is going to play the audio/video of the IP camera 2167, the client 2178 first predicts the port number of NAT 1 by sending a SETUP packet to the RTSP proxy server 3. This SETUP packet is first filled with the number 2178, so the header is “SETUP 2178 RTSP/1.0”. After the RTSP proxy server 3 receives the SETUP packet, the source IP and port number of the packet are checked and recorded. The source IP is the real IP address “140.124.40.155” of NAT 1, and the port number is the port number allocated by NAT 1.
Thereafter, the RTSP proxy server 3 will respond with a 200 OK message to the client 2178, including the source IP and port number of NAT 1, as shown below:
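The way a proxy learns the sender's outermost address can be sketched with the address returned by accept. On loopback this equals the client's own local address; behind a NAT it would be the NAT's real IP and the port the NAT allocated. The plain "ip:port" reply used here is an assumption for illustration and is not the 200 OK format of the invention.

```python
import socket


def observe_source_address():
    """Return (address reported by the proxy, client's own local address)."""
    proxy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    proxy.bind(("127.0.0.1", 0))
    proxy.listen(1)
    client = socket.create_connection(proxy.getsockname())
    conn, source = proxy.accept()        # source = (ip, port) seen by the proxy
    # The proxy reports the observed source address back to the sender.
    conn.sendall(f"{source[0]}:{source[1]}".encode())
    conn.close()
    # Read until the proxy closes, so the whole reply is collected.
    reply = b""
    while True:
        chunk = client.recv(64)
        if not chunk:
            break
        reply += chunk
    local = client.getsockname()
    client.close()
    proxy.close()
    return reply.decode(), f"{local[0]}:{local[1]}"
```

Without a NAT in between, both returned strings are identical; with a NAT they would differ, and the difference is exactly the mapping the client needs to learn.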
Therefore, the client 2178 will know the port number of NAT 1 after receiving the 200 OK packet. The client 2178 will then send the SETUP packet several times in order to detect the port number allocation rule of NAT 1.
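Detecting an allocation rule from several observed port numbers can be sketched as follows. The source does not state which rules are detected, so this example assumes the simplest case, a NAT that allocates ports with a constant step; the function name and the sample port numbers are hypothetical.

```python
def predict_next_port(observed_ports):
    """Predict the next NAT port, assuming a constant allocation step.

    Returns None when the observations do not follow a constant step.
    """
    if len(observed_ports) < 2:
        raise ValueError("need at least two observed ports")
    steps = {b - a for a, b in zip(observed_ports, observed_ports[1:])}
    if len(steps) != 1:
        return None                      # no constant step detected
    return observed_ports[-1] + steps.pop()
```

For example, three probes that report ports 62000, 62001 and 62002 suggest a step of 1, so the next allocation is predicted as 62003. A NAT that allocates ports randomly yields None, which corresponds to the case where prediction fails.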
After the port number is predicted, the real IP (140.124.40.155) of NAT 1 and the port number allocated to the IP camera 2167 are filled into the Transport header of the SETUP packet to be sent to the IP camera 2167, as shown below.
“SETUP 2167 RTSP/1.0” is sent to the RTSP proxy server 3 through NAT 1, and then sent to the IP camera 2167 through NAT 2. After the IP camera 2167 receives the message, it performs the same detecting procedure as the client 2178 did with SETUP, in order to detect the port number allocation rule of NAT 2 of the IP camera 2167.
After predicting the port number, the IP camera 2167 fills the real IP (126.16.64.4) of NAT 2 and the port number allocated to the client 2178 into the Transport header of the 200 OK packet to be sent to the client 2178, as shown below.
The 200 OK response packet is transmitted to the RTSP proxy server 3 through NAT 2, and then sent to the client 2178 through NAT 1.
After the client 2178 receives the 200 OK response packet, it calls the “Start TCP Client” API to connect to 126.16.64.4:(NAT 2 predicted port number). According to the “Three-way Handshaking”, a SYN packet is sent to the NAT 2 predicted port, but because the packet is stopped at NAT 2, the “Three-way Handshaking” will fail to get an ICMP packet. The “Start TCP Client” API returns an error message, so the client 2178 stops the socket connection immediately, and then calls “Start TCP Client” again using the same port number to generate a “receiving socket”.
Then the IP camera 2167 follows the “Transport” in SETUP 2167 and calls the “Start TCP Client” API to connect to 140.124.40.155:(NAT 1 predicted port number). According to the “Three-way Handshaking”, the SYN packet arrives at the NAT 1 predicted port of the client 2178. Since the earlier SYN for the TCP connection from the client 2178 left through that NAT 1 port and was recorded in a table of NAT 1, the SYN packet from the IP camera 2167 can pass through the NAT 1 port to reach the “receiving socket” of the client 2178 and finish the “Three-way Handshaking”.
At this moment, a peer to peer TCP channel is set up. The client 2178 can then use the PLAY request to ask the IP camera 2167 to send out media packets, which completes the NAT traversal.
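The client-side sequence above (an outbound connection attempt that fails, followed by a “receiving socket” on the same port number) can be sketched with ordinary sockets. This is a loopback illustration only: no NAT is involved, the SO_REUSEADDR option and the 2 second timeout are assumptions, and the function names are hypothetical. Behaviour across real NAT devices and operating systems may differ.

```python
import socket


def reusable_socket(local_port=0):
    """Create a TCP socket bound to a loopback port that can be rebound."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", local_port))
    return sock


def punch_then_listen(remote_addr):
    """Send a SYN from a local port, then listen on that same port."""
    outbound = reusable_socket()
    local_port = outbound.getsockname()[1]
    outbound.settimeout(2.0)
    try:
        outbound.connect(remote_addr)    # expected to fail, as in the text
    except OSError:
        pass                             # the SYN has already left this port
    finally:
        outbound.close()
    receiving = reusable_socket(local_port)   # same port number as before
    receiving.listen(1)
    return receiving, local_port


def demo():
    """Loopback stand-in: the peer connects to the port that sent the SYN."""
    # A bound but non-listening socket refuses connections, like a closed port.
    placeholder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    placeholder.bind(("127.0.0.1", 0))
    receiving, local_port = punch_then_listen(placeholder.getsockname())
    placeholder.close()
    peer = socket.create_connection(("127.0.0.1", local_port))
    conn, _ = receiving.accept()
    peer.sendall(b"ping")
    data = conn.recv(4)
    for sock in (peer, conn, receiving):
        sock.close()
    return data
```

In the scenario of the text, the remote address would be the predicted NAT 2 address and the peer would be the IP camera connecting to the predicted NAT 1 port.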
The first embodiment is the preferred embodiment, but the port number prediction or the traversal will sometimes fail. In that case, an RTP-Relay method with flow rate control is used instead.
Referring to
The SETUP packet passes through NAT 1 to the RTSP proxy server 3. The RTSP proxy server 3 then modifies the SETUP packet, changing the description in the Transport header so that it refers to the RTP-Relay 4, as shown below:
The modified SETUP packet is sent to NAT 2 of the IP camera 2167, and finally arrives at the IP camera 2167. After receiving the SETUP, the IP camera 2167 responds with a 200 OK message. The IP address (virtual IP) of the IP camera 2167 and the port number for the media connection are filled into the Transport header of the 200 OK message, as shown below:
The 200 OK packet passes through NAT 2 of the IP camera 2167 to the RTSP proxy server 3. The RTSP proxy server 3 then modifies the 200 OK packet, changing the description in the Transport header so that it refers to the RTP-Relay 4, as shown below:
The modified 200 OK packet passes through NAT 1 to the client 2178.
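The header rewriting performed by the proxy can be sketched as a text substitution on the Transport line. The exact Transport syntax used by the invention is not reproduced here; the `destination=ip:port` parameter, the CSeq value and the function name are hypothetical and serve only to show the idea of replacing an endpoint address with the relay address.

```python
def rewrite_transport(message, relay_ip, relay_port):
    """Point the Transport header of an RTSP message at the relay."""
    rewritten = []
    for line in message.split("\r\n"):
        if line.lower().startswith("transport:"):
            params = [p.strip() for p in line.split(":", 1)[1].split(";")]
            # Drop the original endpoint (hypothetical parameter name).
            params = [p for p in params if not p.startswith("destination=")]
            params.append(f"destination={relay_ip}:{relay_port}")
            line = "Transport: " + ";".join(params)
        rewritten.append(line)           # all other lines pass through unchanged
    return "\r\n".join(rewritten)


setup = (
    "SETUP 2167 RTSP/1.0\r\n"
    "CSeq: 2\r\n"
    "Transport: RTP/AVP/TCP;destination=140.124.40.155:5000\r\n"
    "\r\n"
)
modified = rewrite_transport(setup, "202.145.2.1", 1200)
```

The same function could be applied to the 200 OK packet on its way back, with the relay port that the client is meant to connect to.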
When the client 2178 is going to play the media, the client 2178 sends a PLAY packet through the RTSP proxy server 3 to the IP camera 2167. After receiving the PLAY packet, the IP camera 2167 responds with a 200 OK packet. When the client 2178 receives the 200 OK packet, the client 2178 starts a TCP connection to the RTP-Relay 4 according to the Transport in the response to SETUP, i.e. it connects to 202.145.2.1:1201. A pre-established media TCP channel between the NAT 1 of the client 2178 and the RTP-Relay 4 is thereby set up.
When the IP camera 2167 starts transmitting streaming media data, the IP camera 2167 also starts a TCP connection to the RTP-Relay 4 according to the Transport of the SETUP packet in the CallSetup Session, and transmits the streaming media data to 202.145.2.1:1200 one by one. The RTP-Relay 4 then forwards the media data to the media TCP channel established between the NAT 1 of the client 2178 and the RTP-Relay 4, and finally the streaming media data are delivered to the client 2178.
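The forwarding role of the relay can be sketched as a process that accepts one connection from the client and one from the camera, then copies bytes from the camera side to the client side. The loopback addresses, the OS-chosen ports (in place of 1200 and 1201), the 4096-byte chunk size and the function names are assumptions for a single-machine illustration.

```python
import socket
import threading


def listen_on_loopback():
    """Return a listening TCP socket on an OS-chosen loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    return sock


def relay_once(camera_listener, client_listener):
    """Accept one client and one camera, forward camera data to the client."""
    client_conn, _ = client_listener.accept()   # pre-established media channel
    camera_conn, _ = camera_listener.accept()
    with client_conn, camera_conn:
        while True:
            chunk = camera_conn.recv(4096)
            if not chunk:                        # camera closed the connection
                break
            client_conn.sendall(chunk)


def demo():
    """Run a camera, a relay and a client on loopback; return client data."""
    camera_listener = listen_on_loopback()
    client_listener = listen_on_loopback()
    worker = threading.Thread(
        target=relay_once, args=(camera_listener, client_listener)
    )
    worker.start()
    client = socket.create_connection(client_listener.getsockname())
    camera = socket.create_connection(camera_listener.getsockname())
    camera.sendall(b"frame-1frame-2")
    camera.close()
    received = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:                            # relay closed the channel
            break
        received += chunk
    worker.join()
    for sock in (client, camera_listener, client_listener):
        sock.close()
    return received
```

Both endpoints only make outbound connections to the relay, which is why this path works even when port prediction fails, at the cost of routing all media through the relay.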
However, using only the RTP-Relay has a disadvantage. Suppose that the audio bandwidth of a media stream is 2 Mb/sec, at an expense of NT$20,000 per month. If 1 million users try to download the streaming media data from the media server simultaneously, the bandwidth expense for the RTP-Relay will be NT$20 billion per month. The second embodiment is therefore used only when the first embodiment fails.
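The cost figure follows from a single multiplication, shown here as a short calculation. It assumes, as the figures in the text imply, that NT$20,000 is the monthly expense of one 2 Mb/sec stream; the variable names are illustrative.

```python
COST_PER_STREAM_NTD = 20_000      # monthly expense of one 2 Mb/sec stream
STREAM_MBPS = 2                   # bandwidth of one stream in Mb/sec
CONCURRENT_USERS = 1_000_000      # users relayed at the same time

# Every relayed user needs one stream's worth of relay bandwidth.
total_cost_ntd = COST_PER_STREAM_NTD * CONCURRENT_USERS
total_bandwidth_mbps = STREAM_MBPS * CONCURRENT_USERS
```

The result is NT$20,000,000,000 per month, i.e. NT$20 billion, for an aggregate relay bandwidth of 2,000,000 Mb/sec.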
The special features of the improved RTSP according to the present invention are:
The scope of the present invention depends upon the following claims, and is not limited by the above embodiments.
Number | Date | Country | Kind |
---|---|---|---|
101115178 | Apr 2012 | TW | national |