The present invention relates to an authentication mechanism for sequenced message streams and a method for performing the authentication mechanism.
As Voice over Internet Protocol (VoIP) deployment becomes more prevalent, Denial of Service (DoS) attacks against them have become a cause of concern. The Real-Time Transport Protocol (RTP) is an Internet protocol standard that specifies a way for programs to manage real-time transmission of multimedia data. RTP is commonly used with Internet telephony. The functioning of VoIP device can be impaired by sending fake RTP packets, which, if spoofed properly, are played back resulting in degraded audio quality. The Secure Real-Time Transport Protocol (SRTP) defines a profile of RTP intended to provide encryption, message authentication and integrity, and replay protection to RTP data such as, for example, Voice over Internet Protocol (VoIP). Authentication is generally performed by attaching an authentication tag to the message to be sent. The tag is calculated using a cryptographically secure hash algorithm, such as HMAC-SHA1. The recipient of the message must recompute the tag for every message received, and verify that it matches the one attached to the message.
The protection afforded by SRTP comes with a cost. The hash algorithms used typically incur a high processing overhead. This provides another avenue for a DoS attack. Simply flooding a telephony device with a fake stream of packets causes the device to consume a lot of CPU cycles on authenticating and rejecting them. Depending on the processing power of the device, it is possible to impair its regular functioning with a rate of fake packet traffic that is significantly lower than the device's network capacity. In some cases, even a reasonably low rate packet flood may impair normal functioning of the device.
Three schemes have been proposed to enhance the ability to tolerate DoS attacks against VoIP devices that implement SRTP. The first of these schemes requires that the sender and receiver negotiate a key and a seed so that they independently generate the same cryptographically secure pseudo random number series. Each packet is associated with a sequence number and the receiver must only compare the random number to authenticate the packet. A disadvantage with this scheme is that packets within a certain window are susceptible to a packet replay attack.
A second scheme for enhancing the ability to tolerate DoS attacks against VoIP devices involves computing a hash over fewer bytes. However, this scheme still requires that a hash be computed by the receiver for every packet received. Therefore, this scheme is computationally expensive.
According to a third scheme, the sender calculates in advance a series of random numbers and uses a number as the authentication tag for each packet. The series of random numbers are sent to the receiver along with the RTP payload. However, a disadvantage of this third scheme is that the size of the RTP packet increases by the additional random sequence of numbers transmitted.
An object of the present invention is to provide a method for authenticating sequenced message streams which overcomes the problems of the prior art.
The object is met by a method of authenticating a communication between a sender and a receiver including computing a first sequence of numbers at the sender and computing a second sequence of numbers at the receiver, wherein the second sequence is the same as the first sequence. Successive values of the first sequence are embedded in successive messages sent by the sender. Upon receiving a message, the receiver compares the embedded value with an expected corresponding value from a list of values to be compared comprising at least one value from the second sequence. The received message is considered to be from an authentic sender if the embedded value matches the expected corresponding value. The matched value is then removed from the list of values to be compared to prevent replay attacks.
Accordingly, the authentication according to the present invention is based on comparisons, which makes it very lightweight. The mechanism may be used with Secure Real-Time Transport Protocol (SRTP).
The first and second sequences of numbers may be generated by a hash algorithm based on a secret key and a seeding hash agreed upon by the sender and receiver. The sequence numbers of the first and second sequences are truncated hashes generated by the hash algorithm.
The messages sent between the sender and receiver may comprise RTP packets of a VoIP media stream. The received messages may be sent to an SRTP layer if the received message is considered to originate from an authentic sender.
The first and second sequences may alternatively be generated using a Pseudo Random Number Generator. In this case, the shared secret may comprise a key used for AES encryption.
The receiver stores a window of N sequence numbers from the second sequence and comparisons of the sequence number of the received packet are made only with the N sequence numbers in the window. If a message with the lowest sequence number of the N sequence number is received, the window slides by one sequence number in the second sequence. After a packet is successfully authenticated, the receiver replaces the sequence number with a number outside the current window, thereby preventing replay attacks.
A loss of synchronization between the receiver and the sender may be determined by determining whether the window lags behind the messages sent by the sender. This may be accomplished by monitoring a lifetime of a current window and determining whether the lifetime of the current window exceeds a predetermined time period.
Upon detecting a loss of synchronization, packets that are not authenticated are passed to an SRTP layer to perform authentication at the SRTP layer. The first packet to be authenticated by the SRTP layer may be used to resynchronize the receiver by setting the window to begin after the first packet to be authenticated by the SRTP layer. Setting the window may comprise computing the sequence of numbers from the sequence number of the last packet authenticated to the first packet to be authenticated by the SRTP layer. Alternatively, setting the window may be effected by including the authentication sequence number of the first packet authenticated by the SRTP layer as part of the SRTP payload.
Detection of a loss of synchronization may alternatively be accomplished by tracking a position of received packets within the window.
Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims. It should be further understood that the drawings are not necessarily drawn to scale and that, unless otherwise indicated, they are merely intended to conceptually illustrate the structures and procedures described herein.
In the drawings:
The present invention relates to a method for communicating an RTP media stream between two network endpoints, i.e., a sender and a receiver.
According to a
At steps S12a and S12b, the sender and the receiver each independently compute a sequence of numbers Na and Nb from the shared secret. The sequences are the same such that Na=Nb. The sequence is referred to hereafter as sequence Ns and is cryptographically secure so that no other party can generate Ns without knowledge of the shared secret. Furthermore, an attacker should not be able to generate some or all of the elements of Ns given knowledge of the some of the elements of Ns. More specifically, the would-be attacker should not be able to generate Ns(i+1), if the attacker knows Ns(i). One example is a sequence of hashes generated using a cryptographically secure hash algorithm hash( ) such as HMAC-SHA1, wherein the hash values are computed as follows:
H0=Hash(Hs, K), and
Hi=Hash(Hi−1, K) for i>=1.
The hash algorithm HMAC-SHA1 computes 20 byte hashes. However, the hash values used may be truncated to a smaller length L, such as, for example, by using the first L bits of the 20 byte hash.
The sender embeds successive hashes of the sequence in the successive messages sent to the receiver, step S14. Since the receiver knows which packet to expect next in the stream, the receiver compares the hash of the received packet with the expected hash (which the receiver has already computed), step S16. Once the packet is authenticated by matching the hash of the received packet with the expected hash, step S18, the receiver can replace the expected hash with the next one in the sequence. Once the packet is authenticated, it can be passed onto the SRTP layer.
If the packet received is the lowest one in the window, then the window slides forward in the sequence Ns, step S56. Furthermore, when a packet is successfully authenticated and it is no the lowest one in the window, it is replace by the sequence number after the last packet of the window, step S58. This provides protection against packet replay attacks in which an attacker captures a valid packet, spoofs the client that it captured the packet from and replays the packet over and over again. The expensive process of computing a new hash is performed only when a legitimate packet is received, not with the reception of every packet.
If packet loss occurs, the receiver may lose synchronization with the sender in that the window of acceptable packets may lag behind the packets sent by sender. Accordingly, it is necessary for the receiver to occasionally resynchronize with the sender. Without the resynchronization, legitimate packets could be dropped.
After performing the steps of
Another variation of the man in the middle attack is when the attacker can modify the contents of the packet. The attacker may modify the contents and then send the modified copy or copies onward. If the modified packets reach the recipient first, communication will be disrupted. To obviate this problem, when a packet is received and the hash or other sequence number is verified, the packet is passed onto the. SRTP where the hash is computed over the entire payload, which helps reject modified copies. However, this solution imposes extra overhead which must be weighed against the benefit of protection against man in the middle attacks.
L is an important parameter of our protocol. Since L bits are appended to every packet, we must keep it small to minimize the overhead of transmitting the extra L bits. L also determines the probability of an attacker generating the right hash for a packet. A brute force attacker would simply send a flood of 2L packets, and the hash for one of these packets will match the real one. Of course, the fake packet will be rejected by the SRTP layer as described above. Therefore, the only disadvantage would be the extra overhead of SRTP in ensuring that we are not accepting a fake packet. However, even a reasonably small value of L makes the probability very small, and makes a brute force attack much harder. For example, L=32 would imply that a brute force attacker would have to send 232 packets between the inter-arrival time of legitimate packets (20 ms for the G.711 codec) to incur the extra overhead of SRTP.
According to
The receiver negotiates K and Hs with the sender. The receiver includes a hash_buffer which stores a window of hashes and a pointer HI which points to the beginning of a current window. The receiver first sets the first entry in the hash buffer as the hash seed Hs, fills the hash_buffer with the first N hashes by computing the first N hashes, sets the pointer HI to 0, and sets a last_hash to the hash_buffer[N-1]. For each packet received, the receiver extracts the hash from the packet and sets H to the extracted hash, determines the position in the window which matches H. If the hash H is not matched, I is invalid and the packet is discarded. If authentication with SRTP fails, the packet is discarded.
If the hash matches, the hash_buffer[I] is replaced with hash(last_hash,K). If I was at the beginning of the window, then the window is slid forward by one place.
Although the above examples refer to VoIP media streams, the method according to the present invention is applicable to authentication of any sequenced message stream. It also provides for one-time pads that can be used for lightweight encryption.
Thus, while there have shown and described and pointed out fundamental novel features of the invention as applied to a preferred embodiment thereof, it will be understood that various omissions and substitutions and changes in the form and details of the devices illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit of the invention. For example, it is expressly intended that all combinations of those elements and/or method steps which perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Moreover, it should be recognized that structures and/or elements and/or method steps shown and/or described in connection with any disclosed form or embodiment of the invention may be incorporated in any other disclosed or described or suggested form or embodiment as a general matter of design choice. It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto.