The present invention is directed to the field of communications in general and to a method of recovering from a failure in a communications system in particular.
Multimedia communications services over Packet Based Networks (PBN) which may not provide a guaranteed Quality of Service are described in ITU-T Recommendation H.323 (February 1998). The packet based network over which H.323 entities communicate may be point-to-point connections, a single network segment, or an internetwork having multiple segments and possibly complex topologies. ITU-T Recommendation E.164 (May 1997) describes the international public telecommunication numbering plan.
A typical IP telephony system comprises a plurality of hosts interconnected via a backbone network composed of a number of routers to which the hosts are connected. These hosts are grouped in network “islands” which have high bandwidth available between all hosts in an island. These islands of high bandwidth are interconnected by the “backbone” network comprising a number of links of known but limited bandwidth between pairs of islands. The bandwidth available on a link between two islands will not generally be sufficient to carry all the telephony traffic between those islands which the hosts in the islands could, in theory, generate. Associated with each inter-island link there is therefore a pair of hosts, one at each end of the inter-island link, which perform an Admission Control Function (ACF). When a host in an island wants to use bandwidth on an inter-island link, it must first be granted permission by the local host performing the ACF for that inter-island link. The ACF ensures that the link bandwidth is never over-committed: if granting permission for more bandwidth use would over-commit the link, permission is denied. In H.323 based networks, the admission control function is contained within the H.323 gatekeeper. The guaranteed Quality of Service required for correct transmission of telephony traffic through restricted-bandwidth inter-island links thus depends on correct operation of the ACF. To control traffic on the backbone network links correctly, the ACFs at both ends of an inter-island link must hold the same information about link usage. In normal operation, this is achieved by synchronising the two ACFs by way of inter-host signalling.
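By way of illustration only, the admission decision described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular gatekeeper: the class name `AdmissionControl`, the call identifiers and the bandwidth figures in kbit/s are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionControl:
    """Hypothetical ACF for one inter-island link (all names are illustrative)."""

    link_capacity_kbps: int
    allocations: dict = field(default_factory=dict)  # call id -> kbit/s granted

    def committed(self) -> int:
        """Total bandwidth currently granted on the link."""
        return sum(self.allocations.values())

    def request(self, call_id: str, kbps: int) -> bool:
        """Grant the request only if the link would not be over-committed."""
        if self.committed() + kbps > self.link_capacity_kbps:
            return False  # permission denied
        self.allocations[call_id] = kbps
        return True

    def release(self, call_id: str) -> int:
        """Free the bandwidth held by a call and return the amount freed."""
        return self.allocations.pop(call_id, 0)


# Assumed figures: a 128 kbit/s link and three calls of 64 kbit/s each.
acf = AdmissionControl(link_capacity_kbps=128)
granted = [acf.request(f"call-{i}", 64) for i in range(3)]
```

In this sketch the third request is refused because granting it would commit more bandwidth than the link provides; once an earlier call is released, the same request can be granted.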
If the ACF in a host at one end of a link fails, the media (traffic) carried by that link will continue to flow (i.e. existing calls will continue). The island would normally be provided with a spare ACF that can be brought into service to replace the failed one. This new ACF will assume the role of the failed one, taking over control of its inter-island links. The replacement ACF will have no knowledge of the link resource allocations in effect immediately before the failure of its predecessor. Although it will gradually obtain knowledge of the true allocation state through updates from inter-island signalling generated when existing allocations are released, it is not in a position on coming into service to immediately authorise new allocation requests. The time to recover is directly related to the duration of resource allocations handled by the ACF. In an IP telephony network the allocations of inter-island resources are associated with calls. The ACF will not fully regain control of the inter-island link until all calls in progress at the time of failure have cleared. Due to the potentially long call hold time, the ACF will not be fully in control for an unacceptably long time following failure.
The present invention provides a communications system comprising a plurality of islands and a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands, in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; in which the system also comprises means for detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; in which the system also comprises means for providing to the replacement or recovered control means, on replacement or recovery, information on the allocation of the resources of the media path.
The present invention further provides a method of managing communications in a communications system comprising a plurality of islands and a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; the method comprising the steps of detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; providing to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.
Embodiments of the present invention will now be described by way of example with reference to the drawings in which:
The sequence of events followed in setting up and clearing down a successful call from IP telephony terminal H to IP telephony terminal J will now be described with reference to
At the end of the call, one or other of the terminals H, J informs its local call control function (CCF K, L respectively) that it wishes to terminate the call. The local CCF in turn contacts the local ACF, D or E, to free the resources allocated in path M for that call. Termination of the call in islands A, B may be synchronised by communication between the two CCFs K, L.
If ACF E fails while calls are in progress, it is rapidly replaced by replacement ACF G. However, ACF G does not have a copy of the current resource allocation data that was being used by ACF E prior to failing, so when replacement ACF G is called upon to allocate resources it cannot be sure whether the required resources are currently available.
Whenever an existing call is cleared, the replacement ACF G will receive a de-allocation request informing it of any freeing of resources required. In existing protocols the de-allocation request simply refers to the original allocation request without repeating details of the allocation. This means that the replacement ACF G cannot determine how much of the resources to release as calls clear.
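The difficulty may be illustrated with a minimal sketch in which a de-allocation request carries only a reference to the original allocation. The class `Acf`, the reference string `"ref-1"` and the resource quantities are assumptions made for illustration and are not taken from any existing protocol.

```python
class Acf:
    """Hypothetical ACF that records each allocation under a reference."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.by_ref = {}  # allocation reference -> quantity granted

    def allocate(self, ref, quantity):
        """Grant the allocation unless it would over-commit the resources."""
        if sum(self.by_ref.values()) + quantity > self.capacity:
            return False
        self.by_ref[ref] = quantity
        return True

    def deallocate(self, ref):
        """Return the quantity freed, or None if the reference is unknown."""
        return self.by_ref.pop(ref, None)


original = Acf(capacity=100)
original.allocate("ref-1", 40)

replacement = Acf(capacity=100)  # comes into service with no records

freed_by_original = original.deallocate("ref-1")        # 40
freed_by_replacement = replacement.deallocate("ref-1")  # None
```

The original ACF can resolve the reference to a quantity, whereas the replacement, holding no record of the allocation, has nothing to look up and so cannot tell how much resource the cleared call has freed.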
Since both ACFs D and E were informed of all calls traversing the inter-island path M, they both have similar resource allocation data. According to the present invention, the replacement ACF G communicates with its peer ACF D at the other island A to retrieve information on the current resource allocations against the inter-island path M.
In existing protocols, individual resource allocations (e.g. as used in communications between the CCF and ACF within an island) are only meaningful within that island. Thus it is not possible simply to transfer resource allocation information from one ACF to its peer in another island, as that island will be unable to correlate the transferred resource allocation information with calls handled by that island.
As illustrated in
On clearing calls for which the original allocation data in ACF E has been lost, a de-allocation request will be sent to the ACF D still operating in island A. This ACF performs its normal resource de-allocation actions, and in addition, if the allocation data indicates that the call is one of those in force at the time of the request message from the replacement ACF G, then it will send a message to ACF G indicating the quantity of resources that are being de-allocated. ACF G increases its recorded level of unused resource accordingly.
Messages from ACF D to ACF G continue to be sent until all resources that were allocated at the time of the failure of ACF E have been released. At this point both working ACFs D and G will have a complete set of corresponding resource allocation records and the recovery process will have completed. This is not to say that all resources must be free at any one time in order to achieve recovery. On the contrary, any resources freed since receipt of the request message from the replacement ACF G may be re-allocated by recovered ACF G at any time.
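The recovery exchange described above may be sketched as follows. This is a simplified model under stated assumptions rather than a definitive implementation: the class and method names are hypothetical, the reply to the request message is assumed to carry the total quantity of resources then allocated, and allocations made after recovery begins are not modelled.

```python
class PeerAcf:
    """Plays the role of the surviving ACF D (hypothetical model)."""

    def __init__(self, allocations):
        self.allocations = dict(allocations)  # call id -> resource quantity
        self.pre_failure = set()              # calls in force at the request

    def handle_recovery_request(self):
        """Mark the calls now in force and report the total they hold."""
        self.pre_failure = set(self.allocations)
        return sum(self.allocations.values())

    def deallocate(self, call_id, notify):
        """Normal de-allocation, plus a report to the peer for marked calls."""
        quantity = self.allocations.pop(call_id)
        if call_id in self.pre_failure:
            self.pre_failure.remove(call_id)
            notify(quantity, recovery_complete=not self.pre_failure)
        return quantity


class ReplacementAcf:
    """Plays the role of the replacement ACF G (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.unused = 0         # nothing can be granted before the peer replies
        self.recovered = False

    def recover_from(self, peer):
        """Send the request message and record the unused resource level."""
        self.unused = self.capacity - peer.handle_recovery_request()
        self.recovered = not peer.pre_failure

    def on_peer_release(self, quantity, recovery_complete):
        """Increase the recorded level of unused resource as calls clear."""
        self.unused += quantity
        self.recovered = recovery_complete


acf_d = PeerAcf({"call-1": 30, "call-2": 50})
acf_g = ReplacementAcf(capacity=100)
acf_g.recover_from(acf_d)                          # unused is now 20
acf_d.deallocate("call-1", acf_g.on_peer_release)  # unused is now 50
acf_d.deallocate("call-2", acf_g.on_peer_release)  # unused is now 100
```

In this sketch ACF G may grant new requests against its unused level as soon as the peer has replied, and recovery is treated as complete once the last call marked at the time of the request message has cleared.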
Although described above in terms of a replacement ACF, some systems will be able to recover a failed ACF and return it to service within an acceptable time such that replacement is not necessary. Alternatively, a replacement ACF may itself be replaced some time later by the recovered ACF. The present invention also applies to recovered ACFs where information on resource allocation may have been lost, or may merely have become inaccurate due to changes in resource allocation that occurred whilst the ACF was not functioning.
Number | Date | Country | Kind |
---|---|---|---|
0016289.1 | Jul 2000 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB01/02957 | 7/2/2001 | WO | 00 | 4/10/2003 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO02/05488 | 1/17/2002 | WO | A |
Number | Date | Country | |
---|---|---|---|
20030154420 A1 | Aug 2003 | US |