The present invention generally relates to Voice-over-IP (VoIP) and, more specifically, to a method and system for handling and adapting varying frame rates in audio communications.
VoIP calls transfer audio data across an IP (Internet Protocol) network as audio frames, each frame containing a specific amount of audio data. Currently, there are various standards that can be used to encode audio data in a frame. Some of these standards vary with respect to how much audio data is to be included in a frame. As a result, for equipment supporting different frame sizes to be able to communicate with each other, conversion of frame rates has to be carried out.
For example, phone A may be only capable of handling 20 ms of audio data per frame, while phone B may be only capable of handling 30 ms of audio data per frame. This is a common incompatibility issue when using the voice codec G.711. Because of the incompatibility between the two phones, these two phones cannot transfer audio data directly between them. Even if a VoIP gateway is used to facilitate communications between the two phones, the VoIP gateway has to adapt the different frame rates to match the phones' respective requirements.
Hence, it would be desirable to develop a method and system capable of solving the foregoing problem, as well as others, by adapting varying frame rates in audio data communications.
A system for providing frame rate conversion for audio data is provided. In one embodiment, the system includes a first client configured to transmit audio data frames at a first frame rate and a second client configured to receive audio data frames at a second frame rate. The first frame rate is different from the second frame rate. The system further includes a device configured to facilitate transmission of audio data frames between the first client and the second client. The device is further configured to receive the audio data frames from the first client at the first frame rate, store the received audio data frames in an intermediate storage area, and repackage the stored audio data frames into one or more frames for transmission to the second client at the second frame rate.
In one implementation, the system is implemented in software, hardware or a combination of both and the system is incorporated into a Voice-over-IP gateway.
Other features and advantages of the present invention will become apparent by reference to the remaining portions of the specification, including the drawings and claims. Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with respect to the accompanying drawings, in which like reference numbers indicate identical or functionally similar elements.
The present invention in the form of one or more exemplary embodiments will now be described. In one exemplary aspect of the present invention, conversion or translation of varying frame rates is achieved by holding the audio data in local intermediate storage. Inbound data is added to the intermediate storage. As soon as the intermediate storage holds enough data to compose an outbound frame, that frame is transmitted from the intermediate storage. In this way, any inbound frame size can be adapted to any outbound frame size.
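By way of illustration only, the buffering behavior described above might be sketched as follows. The class name FrameRateAdapter, its method names, and the choice to measure frames in bytes are hypothetical and do not appear in this disclosure. The byte counts in the usage lines assume G.711 audio at 8000 samples per second with one byte per sample, so that 20 ms corresponds to 160 bytes and 30 ms to 240 bytes.

```python
from typing import List


class FrameRateAdapter:
    """Hypothetical sketch: buffers inbound audio and emits fixed-size outbound frames."""

    def __init__(self, outbound_frame_bytes: int) -> None:
        if outbound_frame_bytes <= 0:
            raise ValueError("outbound_frame_bytes must be positive")
        self.outbound_frame_bytes = outbound_frame_bytes
        self._buffer = bytearray()  # the intermediate storage

    @property
    def buffered_bytes(self) -> int:
        """Amount of audio data currently held in intermediate storage."""
        return len(self._buffer)

    def push(self, inbound_frame: bytes) -> List[bytes]:
        """Store one inbound frame and return every outbound frame that is now complete."""
        self._buffer.extend(inbound_frame)
        outbound = []
        while len(self._buffer) >= self.outbound_frame_bytes:
            outbound.append(bytes(self._buffer[:self.outbound_frame_bytes]))
            del self._buffer[:self.outbound_frame_bytes]
        return outbound


# Usage under the stated G.711 assumption: 20 ms inbound frames, 30 ms outbound frames.
adapter = FrameRateAdapter(outbound_frame_bytes=240)
first = adapter.push(bytes(160))   # not enough data yet, nothing is sent
second = adapter.push(bytes(160))  # one 240-byte frame is sent, 80 bytes stay buffered
```

Because the loop emits every complete frame it can, the same sketch handles inbound frames that are smaller than, larger than, or not a multiple of the outbound frame size.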
For example, assume that client 12a uses a first protocol which allows transmission of a frame having 20 ms of audio data and client 12b uses a second protocol which allows reception of a frame having 100 ms of audio data. Client 12a wishes to transmit certain audio data to client 12b. The frame rate conversion logic 16 is aware of the respective protocols being used by the clients 12a and 12b. Since a frame from client 12a contains less audio data than a frame that can be handled by client 12b, the frame rate conversion logic 16 first stores a number of frames from client 12a in an intermediate storage area until sufficient audio data for a frame is collected for transmission to client 12b. In this example, once five (5) frames from client 12a have been stored, audio data from these frames is transmitted as a single frame to client 12b.
In a reverse scenario where client 12b wishes to send audio data to client 12a, the frame rate conversion logic 16 breaks up a frame from client 12b into smaller frames, in this case, five (5) frames, and transmits these frames to client 12a.
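As a minimal sketch of the reverse scenario, a large frame can be cut into equal smaller frames. The function name split_frame and the constant BYTES_PER_MS are hypothetical. The constant assumes G.711 audio at 8000 samples per second with one byte per sample, which this disclosure does not require.

```python
from typing import List

BYTES_PER_MS = 8  # assumption: G.711, 8000 samples/s, one byte per sample


def split_frame(frame: bytes, outbound_frame_bytes: int) -> List[bytes]:
    """Break one inbound frame into equal outbound frames (integer-multiple case only)."""
    if outbound_frame_bytes <= 0:
        raise ValueError("outbound_frame_bytes must be positive")
    if len(frame) % outbound_frame_bytes != 0:
        raise ValueError("frame length is not a multiple of the outbound frame size")
    return [
        frame[start:start + outbound_frame_bytes]
        for start in range(0, len(frame), outbound_frame_bytes)
    ]


# A 100 ms frame (as from client 12b) becomes five 20 ms frames (as for client 12a).
large_frame = bytes(100 * BYTES_PER_MS)
small_frames = split_frame(large_frame, 20 * BYTES_PER_MS)
```

This sketch deliberately rejects sizes that do not divide evenly, since that case needs leftover data to be carried over between frames.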
In a more general situation, the frame size used by client 12b is not an integer multiple of the frame size used by client 12a. In that case, once the frame rate conversion logic 16 collects sufficient data from client 12a to form an outbound frame for client 12b, the outbound frame is sent to client 12b and any excess data is stored for use in the next outbound frame. For example, assume that client 12a accommodates a frame having 30 ms of audio data and client 12b accommodates a frame having 40 ms of audio data. Once two (2) frames from client 12a totaling 60 ms of audio data are received, one (1) outbound frame having 40 ms of audio data is sent to client 12b. The remaining 20 ms of audio data is stored in a temporary storage area. When the next frame from client 12a is received, 50 ms of audio data is available (20 ms from the temporary storage area and 30 ms from the new client 12a frame). Another outbound frame is sent to client 12b, and the remaining 10 ms of audio data is put into the temporary storage area. In this manner, each outbound frame is sent to client 12b as soon as it is filled.
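The arithmetic of the 30 ms to 40 ms example can be traced with a short sketch that tracks only durations, not audio samples. The function name simulate and its parameters are hypothetical and are used here for illustration only.

```python
from typing import List, Tuple


def simulate(inbound_ms: int, outbound_ms: int, inbound_frames: int) -> List[Tuple[int, int]]:
    """For each inbound frame, record (outbound frames sent, ms left in temporary storage)."""
    buffered_ms = 0
    trace = []
    for _ in range(inbound_frames):
        buffered_ms += inbound_ms          # add the inbound frame to storage
        sent = 0
        while buffered_ms >= outbound_ms:  # send every frame that is now full
            buffered_ms -= outbound_ms
            sent += 1
        trace.append((sent, buffered_ms))
    return trace


trace = simulate(inbound_ms=30, outbound_ms=40, inbound_frames=4)
# trace == [(0, 30), (1, 20), (1, 10), (1, 0)]
```

The second and third entries reproduce the 20 ms and 10 ms remainders described above. After the fourth inbound frame the storage holds exactly 40 ms, so an outbound frame is sent and nothing remains.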
It should be understood that frame rate and frame size are merely different ways or metrics used to describe a frame. In addition, as frame size directly corresponds to frame length, the conversion can also be referred to as frame length adaption. An increase in frame rate is the same as a decrease in frame length, i.e., a decrease in the amount of audio data in each frame.
In one embodiment, the frame rate conversion logic 16 is shown as residing on the VoIP gateway 14. Based on the disclosure and teachings provided herein, it should be understood that the frame rate conversion logic 16 can reside on any type of equipment that serves as an intermediary between two devices transmitting and/or receiving audio data. The present invention can be incorporated or integrated into various components of a computer network. In an alternative embodiment, the present invention is integrated into traffic aggregation equipment, such as an intelligent bandwidth manager. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will know of various ways and/or methods to deploy the present invention.
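The inverse relationship between frame rate and frame length can be made concrete with a one-line conversion. This sketch assumes that frame rate means frames per second of a continuous audio stream, which this disclosure does not define numerically. The function name frames_per_second is hypothetical.

```python
def frames_per_second(frame_length_ms: float) -> float:
    """Frame rate of a continuous stream, assuming each frame carries frame_length_ms of audio."""
    if frame_length_ms <= 0:
        raise ValueError("frame_length_ms must be positive")
    return 1000.0 / frame_length_ms


rate_20ms = frames_per_second(20)    # 50.0 frames per second
rate_100ms = frames_per_second(100)  # 10.0 frames per second
```

Under that assumption, a client using 20 ms frames runs at five times the frame rate of a client using 100 ms frames, while both carry the same amount of audio per second.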
In an exemplary implementation, the present invention is implemented using software in the form of control logic, in either an integrated or a modular manner. Alternatively, hardware or a combination of software and hardware can also be used to implement the present invention. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will know of other ways and/or methods to implement the present invention.
While the above description is provided in the context of VoIP, it should be understood that the present invention can be deployed in various applications involving different kinds of data having different data sizes or frame lengths. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will appreciate how to deploy the present invention.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes in their entirety.
The present application claims the benefit of priority under 35 U.S.C. §119 from U.S. Provisional Patent Application Ser. No. 60/512,386, entitled “FRAME RATE ADAPTION” filed on Oct. 16, 2003, the disclosure of which is hereby incorporated by reference in its entirety for all purposes.
Number | Date | Country |
---|---|---|
60512386 | Oct 2003 | US |