1. Field of the Invention
The present invention relates to network communications that utilize the resolution of domain names to network addresses.
2. Background Art
Networks, such as the Internet, support various forms of communication. For instance, voice over Internet protocol (VoIP) is a general term for a family of transmission technologies for delivery of voice communications over IP networks such as the Internet or other packet-switched networks. For example, using VoIP, users are enabled to make telephone calls over the Internet using communication devices such as IP phones. When using a VoIP application, a user may expect to hear a dial tone as soon as the user picks up the phone, and may expect to be able to make a call without any problems at any time. However, delays in receiving a dial tone, and other issues, do occur with regard to VoIP telephone calls. Such delays may have various causes.
For instance, the domain name system (DNS) is a hierarchical naming system for computers, services, and further resources participating in communications on the Internet. Each communication device that is configured to communicate over the Internet may be identified by a corresponding DNS domain name, which has an associated IP address. A first communication device may desire to perform a VoIP (or other) communication with a second communication device. The first communication device may identify the second communication device by its domain name. The first communication device may transmit a DNS query that includes the domain name to a DNS server to obtain the IP address for the second communication device, to enable communications with the second communication device. However, a failure of the DNS query may cause a significant amount of network bandwidth to be consumed, because the first communication device may repeatedly transmit the DNS query in further attempts to obtain the IP address for the second communication device. If a large number of communication devices are simultaneously attempting DNS queries that are failing, large amounts of network bandwidth may be consumed, and a voice service outage may even occur.
As such, techniques for avoiding network issues with regard to failed DNS queries are desired.
Methods, systems, and apparatuses are described for handling failed DNS queries and non-responsive DNS servers substantially as shown in and/or described herein in connection with at least one of the figures, as set forth more completely in the claims.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.
The present invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
The present specification discloses one or more embodiments that incorporate the features of the invention. The disclosed embodiment(s) merely exemplify the invention. The scope of the invention is not limited to the disclosed embodiment(s). The invention is defined by the claims appended hereto.
References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
Furthermore, it should be understood that spatial descriptions (e.g., “above,” “below,” “up,” “left,” “right,” “down,” “top,” “bottom,” “vertical,” “horizontal,” etc.) used herein are for purposes of illustration only, and that practical implementations of the structures described herein can be spatially arranged in any orientation or manner.
Embodiments of the present invention may be implemented in communication systems to enable network participants to communicate, while reducing network traffic due to DNS queries. For instance,
Communication system 100 may be configured in various ways. For instance, first and second communication devices 102 and 104 may be any type of communication device configured for communications through network 108, including VoIP communications (which may also be referred to as IP telephony, Internet telephony, voice over broadband (VoBB), broadband telephony, etc.), text messaging, web page browsing, etc. Examples of first and second communication devices 102 and 104 include IP phones, desktop computers (e.g., a personal computer, etc.), servers, mobile computing devices (e.g., a cell phone, smart phone, a personal digital assistant (PDA), a laptop computer, a notebook computer, etc.), etc. Network 108 may be any type of communication network, including a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks, where domain name-to-address resolution is performed to enable communications. For example, network 108 may be an IP network, such as the Internet or other packet-switched network, configured for delivery of voice communications (VoIP) and/or other types of data (e.g., text messaging, web pages, etc.).
First communication device 102 may communicate with second communication device 104. For instance, a first user at first communication device 102 may desire to initiate a voice (e.g., VoIP) conversation with a second user at second communication device 104, may desire to transmit an instant message (e.g., to the second user at second communication device 104), or may desire to otherwise communicate with the second user at second communication device 104. In another example, a user at first communication device 102 may desire to access a website hosted by second communication device 104, which may be a web server. In each case, first communication device 102 may have a domain name that identifies second communication device 104. First communication device 102 may communicate with DNS server 106 to resolve the domain name to an IP address for second communication device 104, so that first communication device 102 may be enabled to transmit a communication signal to second communication device 104.
For instance, as shown in
As shown in
As described above, communication system 100 may have various configurations, depending on the particular application. For instance, in one embodiment, first and second communication devices 102 and 104 may be computing systems, and communication signal 114 may be an instant message. In another embodiment, first communication device 102 may be a computing device, second communication device 104 may be a web server, and communication signal 114 may be a request for a web page. In still another embodiment, first and second communication devices 102 and 104 may be IP phones, and communication signal 114 may be a phone call from first communication device 102 to second communication device 104.
For instance,
First and second DOCSIS networks 206 and 208 may be any type of DOCSIS network, including a DOCSIS HFC (hybrid fiber coaxial) access network. First CMTS 222 provides connectivity between first DOCSIS network 206 and IP network 210, and second CMTS 224 provides connectivity between second DOCSIS network 208 and IP network 210. As shown in
First and second DOCSIS networks 206 and 208 enable high-speed, reliable, and secure transport between communication devices 202 and 204 (e.g., users/customers) and the cable headends at CMTS 222 and 224, respectively. First and second DOCSIS networks 206 and 208 provide DOCSIS capabilities, including Quality of Service. IP network 210 is an example of network 108 in
First and second communication devices 202 and 204 are examples of first and second communication devices 102 and 104 in
Each MTA 232 is a client device that contains a subscriber-side interface to the subscriber's communication device (e.g., device 202 or 204) and a network-side signaling interface to call control elements in the network. An MTA 232 provides codecs (coder-decoders) and signaling and encapsulation functions for media transport and call signaling. MTAs 232 may be connected to other network elements by the corresponding DOCSIS network (e.g., network 206 or 208). Note that in an embodiment, an EMTA 228 may include an IP address for the corresponding cable modem 230 and an IP address for the corresponding MTA 232.
In the example of
CMTS 222 and CMTS 224 each provide data connectivity and complementary functionality to cable modems 230a and 230b, respectively, over the corresponding DOCSIS network (e.g., network 206 or 208). Each of CMTS 222 and 224 also provides connectivity to a wide area network (e.g., IP network 210), and may be located at a cable television system head-end or distribution hub. Each of CMTS 222 and 224 is responsible for allocating and scheduling upstream and downstream bandwidth in accordance with MTA requests and QoS authorizations established by a gate controller (included in CMS 212).
CMS 212 provides call control and signaling related services for the MTA, CMTS, and PSTN gateways in system 200. CMS 212 is a trusted network element that resides on the managed IP portion of system 200. With regard to gateway 216, the MGC is a logical signaling management component used to control PSTN Media Gateways. The MGC maintains the call state and controls the overall behavior of gateway 216. The MGC receives and mediates call-signaling information between the IP network 210 and PSTN 218. The MGC maintains and controls the overall call state for calls requiring a PSTN interconnection. The MGC controls the media gateway by instructing it to create, modify, and delete connections that support the media stream over IP network 210. The signaling gateway (SG) provides a signaling interconnection function between the PSTN signaling network (PSTN 218) and IP network 210. The media gateway (MG) terminates the bearer paths and transcodes media between PSTN 218 and IP network 210.
OSS 214 includes business, service, and network management components supporting core business processes. As defined by the ITU TMN framework, the main functional areas for OSS are fault management, performance management, security management, accounting management, and configuration management. As shown in
The KDC is a security server. The DHCP server is a back office network element used during an MTA device provisioning process to allocate IP addresses and other client configuration information. DNS server 226 is an example of DNS server 106 of
Announcement Server (ANS) 220 is a network component that manages and plays informational tones and messages in response to events that occur in IP network 210. ANS 220 includes an announcement controller (ANC) 234 and an announcement player (ANP) 236. ANC 234 initiates and manages all announcement services provided by ANP 236. ANP 236 is a media resource server responsible for receiving and interpreting commands from ANC 234 and for delivering the appropriate announcement(s) to the MTAs.
For further detail regarding communication system 200 of
The communication systems described above may have problems when requests to resolve domain names fail. For example, referring to system 200 of
When resolving a fully qualified domain name (FQDN), a communication device may first access a local short term or temporary cache to determine whether an IP address corresponding to this FQDN was stored there previously (e.g., after the domain name was previously resolved). If the temporary cache does not store the IP address, the communication device may transmit a DNS query to the DNS server. In response, the DNS server may transmit one or more IP addresses and an associated time to live (TTL) (e.g., time out value) for the IP address(es) to the communication device. The communication device may use an IP address received in the response to perform a communication, and may store the IP address(es) in the temporary cache. The IP address may be stored in the temporary cache until the TTL value expires. Use of this temporary cache for storing resolved IP addresses has the benefit of reducing domain name resolution delays due to DNS queries, and may reduce a number of queries transmitted to the DNS server for IP addresses that are used regularly.
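The temporary cache behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular communication device: the `TtlCache` class, its method names, and the injectable `clock` parameter are hypothetical names introduced here.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Short term cache: an entry is usable only until its TTL elapses."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be exercised without waiting
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (address, expiry)

    def put(self, name: str, address: str, ttl_seconds: float) -> None:
        """Store an address returned by the DNS server together with its TTL."""
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name: str) -> Optional[str]:
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expiry = entry
        if self._clock() >= expiry:
            del self._entries[name]  # TTL expired: the entry may no longer be used
            return None
        return address
```

A `None` result corresponds to the case in which the communication device would transmit a DNS query to the DNS server.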
A cable company or other entity that operates DOCSIS networks may be referred to as a multi-system operator (MSO). In an embodiment, to address specific MSO requirements for handling DNS server failures, an additional “permanent” or long term cache may be maintained in first communication device 202 that does not expire according to the TTL time out value. IP addresses obtained due to successful DNS queries may be stored in the long term cache as well as the temporary cache. When resolution of a domain name that has previously been resolved is needed, but the TTL for the resolution has expired, and if DNS server 226 fails to provide a valid response to a DNS query, the long term cache may be accessed for the IP address. Because an EMTA typically stores relatively few FQDNs in the long term cache, a loss of DNS server 226 may be tolerated for long periods of time without requiring a prohibitively large long term cache size.
A “back-off and retry” process may be performed by a communication device after an initial failure of a DNS server to respond to a DNS query, to repeatedly retry the DNS query after progressively longer waiting periods. The DNS query may be repeatedly retried in the hopes that a response is eventually received, up to a predetermined number of retries. However, such a process results in delays associated with a need to send multiple requests to the DNS server, and the need to wait for each response for some predefined period of time.
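For illustration, a back-off and retry process of this general kind can be sketched as shown below. The function name, its parameters, and the doubling `growth` factor are assumptions made for this sketch; the description above only requires that the waiting periods become progressively longer and that the number of retries be bounded.

```python
from typing import Callable, Optional


def back_off_retry(
    send_query: Callable[[], Optional[str]],
    max_retries: int,
    initial_wait: float,
    sleep: Callable[[float], None],
    growth: float = 2.0,
) -> Optional[str]:
    """Retry a DNS query, waiting longer before each successive attempt.

    send_query returns an address string, or None when no response arrives.
    Returns the address, or None once max_retries attempts have all failed.
    """
    wait = initial_wait
    for _ in range(max_retries):
        sleep(wait)               # wait before re-sending the query
        address = send_query()
        if address is not None:
            return address        # the DNS server finally answered
        wait *= growth            # progressively longer waiting period
    return None
```

In a real device `sleep` could be `time.sleep`; passing it as a parameter keeps the sketch testable without actual delays. The sum of the waits is the delay the caller experiences when every retry fails, which is the drawback noted above.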
For general network applications (e.g., Internet browsing, e-mail transactions), network delays of short periods of time due to repeated DNS query attempts may not significantly affect application functionality or user experience. For VoIP applications, however, even short delays may be undesirable to users. For example, when a user accesses an IP telephone (e.g., first communication device 202) to make a call, the communication device may attempt to resolve one or more domain names used to enable the call. Depending on the particular call flow, multiple DNS queries may be required. If domain name resolution is delayed, a user will not hear a dial tone for the duration of the resolution process. This delay may be perceived by a user as a service failure and may negatively affect the user experience.
Existing solutions based on the standards that define the DNS backoff and retry behavior (RFC 1034, RFC 1035, RFC 2308) do not meet the stringent requirements of VoIP applications. In another example solution to the retry delay, lookups may be performed proactively as each cache TTL is about to expire. In such a procedure, the DNS lookup may be performed asynchronously to the telephony application, and hence may avoid the situation where the lookup delay occurs in series. In the event of a DNS failure, the cache TTL could be reset to keep the IP address associated with the failed domain resolution attempt stored in the cache. However, a drawback to such an approach is that it forces every communication device to generate DNS traffic for each DNS IP address entry periodically, at intervals slightly shorter than the TTL for that entry. Considering that thousands of communication devices may perform this procedure to maintain IP addresses, and depending on the TTL values and number of IP address entries, the combined amount of network traffic may be very large, leading to network delays.
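The scale of the traffic produced by proactive refreshing can be estimated with simple arithmetic. The sketch below assumes that every device refreshes every cached entry about once per TTL; the function name and the example figures (10,000 devices, 6 entries each, a 60 second TTL) are hypothetical and chosen only to make the calculation concrete.

```python
def proactive_refresh_rate(devices: int, entries_per_device: int, ttl_seconds: float) -> float:
    """Approximate aggregate DNS queries per second when each device
    refreshes each of its cached entries about once per TTL."""
    return devices * entries_per_device / ttl_seconds


# Hypothetical deployment: 10,000 devices, 6 entries each, 60 second TTL.
print(proactive_refresh_rate(10_000, 6, 60.0))  # 1000.0
```

Because the refresh occurs slightly before each TTL expires, the actual rate would be somewhat higher than this estimate.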
Thus, techniques are desired that enable VoIP applications to minimize delays associated with the domain name resolution procedure.
In embodiments, DNS resolution techniques are provided that are more fault tolerant to potential failures at DNS servers than conventional techniques. Embodiments enable fewer DNS query retries to be performed when a DNS server is non-responsive, to reduce delays and network traffic. In an embodiment, a number of DNS queries is reduced on a continuous basis the longer the DNS server stays non-responsive.
For instance,
TTL-based cache 306, negative cache 308, and long term store cache 310 may be included in one or more storage devices of a communication device, such as a magnetic disc (e.g., in a hard disk drive), an optical disc (e.g., in an optical disk drive), a memory device such as a RAM device, an EPROM device (e.g., a flash memory device), etc., and/or any other suitable type of read/write storage medium.
DNS resolver 302 is configured to determine IP addresses to enable communications. For example, one or more IP addresses may need to be determined for one or more domain names to perform a particular communication (e.g., a voice communication, an instant message communication, a web page request, etc.). DNS resolver 302 may access TTL-based cache 306 to determine if an address corresponding to a desired domain name is present. As described above, entries in TTL-based cache 306 have an expiration time. If TTL-based cache 306 does not include an address for the domain name, negative cache 308 may be accessed for a negative entry associated with the domain name. A negative entry in negative cache 308 for the domain name indicates that a previous attempt to obtain the address for the domain name from a DNS server failed. If a negative entry is present in negative cache 308, DNS resolver 302 may access long term store cache 310 for the address corresponding to the domain name, which was previously stored in long term store cache 310 after being obtained from the DNS server. If a negative entry is not present in negative cache 308, a DNS query (that includes the domain name) may be transmitted by DNS resolver 302 to DNS server 106 to obtain the IP address. If the DNS query fails, DNS resolver 302 may retry the DNS query one or more times. If the subsequent DNS queries fail, DNS resolver 302 may access long term store cache 310 for the address corresponding to the domain name.
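The lookup order just described can be sketched as a single function. This is a simplified illustration, not the resolver itself: the function and parameter names are hypothetical, the TTL-based cache is modeled as a dictionary assumed to hold only unexpired entries, and `query_dns` stands in for transmitting a DNS query and returns `None` when no usable answer is obtained.

```python
from typing import Callable, Dict, Optional, Set


def lookup_address(
    name: str,
    ttl_cache: Dict[str, str],
    negative_cache: Set[str],
    long_term_store: Dict[str, str],
    query_dns: Callable[[str], Optional[str]],
    retries: int = 1,
) -> Optional[str]:
    """Return an address for name, following the cache order in the text."""
    if name in ttl_cache:                  # 1. unexpired cached address
        return ttl_cache[name]
    if name in negative_cache:             # 2. a previous query already failed:
        return long_term_store.get(name)   #    skip the DNS server entirely
    for _ in range(1 + retries):           # 3. initial query plus retries
        address = query_dns(name)
        if address is not None:
            return address
    return long_term_store.get(name)       # 4. every query failed
```

Note that a negative entry short-circuits the DNS query altogether, which is what removes the retry delay for names that have recently failed to resolve.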
DNS resolver 302 may be implemented in a communication device in any manner to provide for domain name resolution. For instance,
As shown in
If DNS response 414 does not include the resolved IP address, or if DNS response 414 is not received at all from DNS server 106, DNS resolver 302 may retry DNS query 412 until a resolved IP address is received, or until a predetermined retry count is reached. In such case, DNS resolver 302 may access long term store cache 310 for the IP address associated with the domain name, and may provide the IP address to communication module 404 in resolved IP address 416, to be used to transmit communication signal 418.
Communication device 102 may perform such an IP address resolution process in various ways. For instance,
Note that initially, DNS resolver 302 may operate in a “normal mode,” in which a failure to resolve a domain name with a DNS server has not yet occurred. As described further below, if a DNS server fails to respond to a domain name resolution request, DNS resolver 302 may enter a “failure mode.” If DNS resolver 302 is subsequently able to communicate with the DNS server and/or to resolve the domain name using the DNS server, DNS resolver 302 may transition back to the “normal mode.”
Referring to
In step 504, a TTL-based cache is accessed for an address corresponding to the domain name. For example, as shown in
For illustrative purposes,
Negative cache 308 may include one or more negative entries 704, which each include a domain name. Negative entries 704 may optionally include further information, in embodiments. Each negative entry 704 indicates that a previous attempt to resolve the indicated domain name failed (e.g., an error message or no response to a DNS query was received). In the example of
Long term store cache 310 may include one or more long term entries 706, which each include a domain name and a corresponding IP address that was previously resolved for the domain name. Entries 706 may optionally include further information, in embodiments. In the example of
Referring back to
In step 508, a local negative cache is accessed for a negative entry corresponding to the domain name. For example, referring to
At decision 510, whether a negative entry is present in the negative cache is determined. For example, if during the access of step 508, cache access logic module 602 determines that a negative entry is present in negative cache 308 corresponding to the domain name in request 410, operation proceeds to step 512. If cache access logic module 602 determines that a negative entry is not present in negative cache 308 corresponding to the domain name (e.g., the negative entry timed out, or no negative entry was ever present for the domain name), operation proceeds to step 518 (in
In step 512, a local long term store cache is accessed for the address corresponding to the domain name. For example, cache access logic module 602 may access long term store cache 310 for an IP address corresponding to the domain name in request 410. Operation proceeds to step 514.
In step 514, the communication signal is enabled to be transmitted to the second communication device. For example, cache access logic module 602 may provide the IP address accessed in long term store cache 310 to communication module 404 in resolved IP address 416, to enable communication module 404 to transmit communication signal 418 to the communication device having the resolved IP address.
In step 516, a next domain name resolution request is awaited.
Referring to
In decision 520, whether the DNS server responds to the DNS query is determined. If a response from DNS server 106 to the DNS query transmitted in step 518 is not detected by communication module 404 and/or DNS resolver 302, operation proceeds to decision 536. If a response from DNS server 106 is detected by communication module 404 and/or DNS resolver 302, operation proceeds to decision 522.
In decision 522, whether the address or an error message is received from the DNS server in response to the DNS query is determined. For example, referring to
In step 524, the address is stored in the TTL-based cache and in the long term store cache. For example, referring to
In step 526, the communication signal is enabled to be transmitted to the second communication device according to the address. For example, cache access logic module 602 may provide the received IP address to communication module 404 in resolved IP address 416, to enable communication module 404 to transmit communication signal 418 to the communication device having the resolved IP address. Operation proceeds to step 534.
In step 528, a negative entry is stored in the negative cache. For example, because an error message was received from DNS server 106 in response to a DNS query, cache access logic module 602 may store a negative entry (e.g., a negative entry 704) in negative cache 308 to indicate the domain name for the failed DNS query. Operation proceeds to step 530.
In step 530, the long term store cache is accessed for the address corresponding to the domain name. For example, cache access logic module 602 may access long term store cache 310 for an IP address corresponding to the domain name of the failed DNS query. Operation proceeds to step 532.
In step 532, the communication signal is enabled to be transmitted to the second communication device using the address accessed in the long term store cache. For example, cache access logic module 602 may provide the IP address retrieved from long term store cache 310 to communication module 404 in resolved IP address 416, to enable communication module 404 to transmit communication signal 418 to the communication device having the retrieved IP address. Operation proceeds to step 534.
In step 534, a next domain name resolution request is awaited. Furthermore, if the current mode for DNS resolver 302 is failure mode, DNS resolver 302 transitions from failure mode to normal mode.
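The handling of a DNS server response in steps 524 through 532 can be sketched as follows. The function name, its signature, and the use of plain dictionaries and a set for the three caches are assumptions made for this sketch; an error message from the DNS server is modeled by passing `None` as the address.

```python
from typing import Dict, Optional, Set, Tuple


def handle_dns_response(
    name: str,
    address: Optional[str],
    ttl_seconds: float,
    now: float,
    ttl_cache: Dict[str, Tuple[str, float]],
    long_term_store: Dict[str, str],
    negative_cache: Set[str],
) -> Optional[str]:
    """Process a DNS response; address is None when an error message arrived."""
    if address is not None:
        ttl_cache[name] = (address, now + ttl_seconds)  # step 524: expires with the TTL
        long_term_store[name] = address                  # step 524: kept past the TTL
        return address                                   # step 526
    negative_cache.add(name)                             # step 528: remember the failure
    return long_term_store.get(name)                     # steps 530 and 532: last known address
```

Writing the address to both caches on every success is what makes the long term store usable as a fallback when a later query fails.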
In decision 536, whether DNS resolver 302 is in normal mode or failure mode is determined. If DNS resolver 302 is in normal mode, operation proceeds to step 538 (
As shown in
The back-off retry procedure of steps 538, 540, 542, 544, and 546 is described for illustrative purposes with respect to
Referring to
In step 540, the DNS query is transmitted to the DNS server. For example, cache access logic module 602 may instruct DNS query generator 604 to retransmit DNS query 412, which includes the request to resolve the unresolved domain name. Communication module 404 receives DNS query 412 from DNS query generator 604, and may retransmit DNS query 412 to DNS server 106 (
In decision 542, whether the address is received from the DNS server is determined. If a response from DNS server 106 to the DNS query transmitted in step 540 is not detected by communication module 404 and/or DNS resolver 302, operation proceeds to decision 544. If a response from DNS server 106 is detected, operation proceeds to step 524 (
In decision 544, whether a maximum number of retry attempts has been performed is determined. For example, in an embodiment, as shown in
In step 546, an increased length of time is waited. For example, in an embodiment, prior to each instance of re-transmitting a DNS query during a particular back-off retry procedure, an increased amount of time may be waited compared to the immediately prior re-transmission. In this manner, network bandwidth may be conserved. As shown in
In step 548, a DNS failure is determined to have occurred, and a failure mode is entered. For example, back-off retry module 606 may indicate to cache access logic module 602 that the back-off and retry algorithm failed to resolve the domain name, and as a result, cache access logic module 602 may indicate the current mode to be failure mode. Operation proceeds to step 550.
In step 550, a negative entry is stored in the negative cache. For example, because the most recent iteration of the back-off and retry procedure failed, cache access logic module 602 may store a negative entry (e.g., a negative entry 704) in negative cache 308 to indicate the domain name that was not resolved. Operation proceeds to step 552.
In step 552, the long term store cache is accessed for the address corresponding to the domain name. For example, cache access logic module 602 may access long term store cache 310 for an IP address corresponding to the domain name that was not resolved. Operation proceeds to step 554.
In step 554, the communication signal is enabled to be transmitted to the second communication device using the address accessed in the long term store cache. For example, cache access logic module 602 may provide the IP address retrieved from long term store cache 310 to communication module 404 in resolved IP address 416, to enable communication module 404 to transmit communication signal 418 to the communication device having the retrieved IP address. Operation proceeds to step 556.
In step 556, a next domain name resolution request is awaited.
In step 560, the retry count value is decreased. To decrease the amount of retry attempts made during subsequent attempts to resolve the domain name where the DNS server continues to be non-responsive, the retry count value is decreased after each cycle of DNS queries. For example, in an embodiment, as shown in
In step 562, a length of time is waited based on a predetermined time out value. For example, as shown in
In step 564, the DNS query is transmitted to the DNS server. For example, cache access logic module 602 may instruct DNS query generator 604 to retransmit DNS query 412, which includes the request to resolve the unresolved domain name. Communication module 404 receives DNS query 412 from DNS query generator 604, and may retransmit DNS query 412 to DNS server 106 (
In decision 566, whether the address is received from the DNS server is determined. If a response from DNS server 106 to the DNS query transmitted in step 564 is not detected by communication module 404 and/or DNS resolver 302, operation proceeds to decision 570. If the address is received from DNS server 106, operation proceeds to step 568. Note that if an error message is received from DNS server 106, in an embodiment, a negative entry may be stored in negative cache 308, long term store cache 310 may be accessed for the IP address corresponding to the domain name, the communication signal may be transmitted to the second communication device using the IP address, and a next domain name resolution request may be awaited.
In step 568, the retry count value is reset to a predetermined original value. For example, retry count modifier module 808 may be configured to reset the value of retry count value 806 to its original value in storage 802 (if retry count value 806 was decreased in storage 802), or may be configured to read the original value of retry count value 806 from storage 802 to overwrite a value for retry count value 806 maintained in back-off retry module 606. In this manner, a next time that a back-off retry algorithm is used to transmit DNS queries to the DNS server to resolve a domain name, the back-off retry algorithm will use the original maximum number of DNS queries.
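The interaction between the decreasing retry count (step 560), its reset (step 568), and the two operating modes can be sketched with a small state object. The class and method names are hypothetical, and the original retry count of 3 used in the example is an assumed value; the sketch follows the description in that the first failed back-off retry cycle enters failure mode without decreasing the count, while each later failed cycle decreases it by one until it reaches zero.

```python
class RetryState:
    """Tracks the resolver mode and the retry count across failed cycles."""

    def __init__(self, original_retry_count: int) -> None:
        self.original = original_retry_count
        self.retry_count = original_retry_count
        self.failure_mode = False

    def on_cycle_failed(self) -> int:
        """Record a back-off retry cycle in which every query went unanswered.

        Returns the number of retries that were allowed for that cycle.
        """
        allowed = self.retry_count
        if not self.failure_mode:
            self.failure_mode = True   # step 548: first failure enters failure mode
        elif self.retry_count > 0:
            self.retry_count -= 1      # step 560: fewer retries next time
        return allowed

    def on_address_received(self) -> None:
        """The DNS server answered: restore the defaults."""
        self.retry_count = self.original   # step 568: reset to the original value
        self.failure_mode = False          # back to normal mode


state = RetryState(original_retry_count=3)
print([state.on_cycle_failed() for _ in range(6)])  # [3, 3, 2, 1, 0, 0]
```

With these assumed values, a DNS server that stays non-responsive draws three retries for each of the first two resolution requests, then two, then one, and then none, so the query load on the network shrinks the longer the outage lasts.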
In step 570, the wait time is increased. Similarly to the description further above, in an embodiment, prior to each instance of re-transmitting a DNS query during a particular back-off retry procedure, an increased amount of time may be waited compared to the immediately prior re-transmission. In this manner, network bandwidth may be conserved. As shown in
An example domain name resolution request is described as follows with respect to flowchart 500 of
Referring to step 538, a back-off retry algorithm is begun that includes step 540, decision 542, decision 544, and step 546. In step 538, the time value of wait time timeout value 804 (
In step 548, failure mode is entered. Operation proceeds to step 550, and a negative entry may be stored for the domain name “deviceZ” in negative cache 308. In step 552, and referring to
Subsequently, in step 502 (
Referring to decision 558, a back-off retry algorithm is begun that further includes step 562, step 564, decision 566, and step 570. In decision 558, initially the maximum number of retry attempts (indicated by retry count value 806) may not be reached, and thus operation proceeds to step 562. In step 562, the time value of wait time timeout value 804 (
In step 560, the retry count value is decreased. For example, the value of retry count value 806 may be decremented so that one fewer DNS query iteration is performed during the next back-off retry procedure iteration. Operation proceeds to step 550 (
In an embodiment, if the DNS server continues to be non-responsive during subsequent requests to resolve “deviceZ”, retry count value 806 may be eventually decreased to zero by sufficient repetitions of step 560 (
For instance,
For example, with regard to the DNS query at time point 902a, because no response is received, back-off retry operation 904a is performed (starting at step 538 in
With regard to the DNS query at time point 902b, because no response is received, back-off retry operation 904b is performed (starting at decision 558 in
With regard to the DNS query at time point 902c, because no response is received, back-off retry operation 904c is performed (starting at decision 558 in
With regard to the DNS query at time point 902d, because no response is received, back-off retry operation 904d is performed (starting at decision 558 in
With regard to the DNS query at time point 902e, no response is received. However, because the retry count value is zero, no back-off retry operation is performed. Similarly, with regard to the DNS query at time point 902f, no response is received, and because the retry count value is zero, no back-off retry operation is performed. In a similar fashion, no back-off retry operations will be performed for subsequent failed DNS queries until after a DNS query receives a response from the DNS server that includes a domain name resolution, and the mode transitions from failure mode back to normal mode (e.g., step 534 of
Negative caching using negative cache 308 works in conjunction with long term caching by storing DNS lookup failures for a defined timeout period so that subsequent lookups on a domain name will be taken directly from long term store cache 310 rather than continuously retrying a DNS query. With an entry in negative cache 308, a cumulative timeout period inherent in a rapid succession of failed DNS lookups may be avoided, thereby satisfying timing requirements.
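The negative caching behavior described above can be illustrated with a minimal sketch. The class name `NegativeCache`, its method names, and the injectable `clock` parameter are hypothetical and are not taken from the embodiments. The sketch only tracks whether a failure entry for a domain name is still within its timeout period; retrieving the address from long term store cache 310 is left to the caller.

```python
import time
from typing import Callable, Dict


class NegativeCache:
    """Remembers domain names whose DNS lookups recently failed (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._expiry: Dict[str, float] = {}

    def record_failure(self, name: str) -> None:
        """Store a negative entry that expires ttl_seconds from now."""
        self._expiry[name] = self._clock() + self._ttl

    def is_failed(self, name: str) -> bool:
        """Return True while an unexpired negative entry exists for name."""
        expires_at = self._expiry.get(name)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expiry[name]  # entry expired: the next lookup may query DNS again
            return False
        return True
```

While `is_failed("deviceZ")` returns True, a resolver built along these lines could skip the DNS query and its back-off retries entirely, which is how the cumulative timeout of repeated failed lookups is avoided.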
RFC 2308 recommends that for error messages (e.g., NXDOMAIN, NODATA and SERVFAIL), a Start Of Authority (SOA) record TTL can be used as TTL for negative caching. This RFC requires DNS servers to include the SOA record in their responses, but experience has shown that many DNS servers do not follow all of the RFC requirements. Thus, in an embodiment, a configured value may be used instead.
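One possible way to choose the negative caching TTL is sketched below. The function name `negative_cache_ttl` and its parameters are hypothetical. Treating the configured value as an upper bound when an SOA TTL is present is an assumption of this sketch; an embodiment may simply use the configured value in all cases.

```python
from typing import Optional


def negative_cache_ttl(soa_ttl: Optional[int], configured_ttl: int) -> int:
    """Pick the lifetime, in seconds, of a negative cache entry."""
    if soa_ttl is None:
        # The response carried no SOA record, so fall back to the configured value.
        return configured_ttl
    # Assumption: the configured value also acts as an upper bound on the SOA TTL.
    return min(soa_ttl, configured_ttl)


print(negative_cache_ttl(None, 300))  # 300
print(negative_cache_ttl(60, 300))    # 60
print(negative_cache_ttl(3600, 300))  # 300
```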
DNS RFCs (request for comments) 1034 and 1035 specify a simple algorithm for back-off retry for DNS resolvers, similar to the back-off retry algorithm described above. DNS RFCs 1034 and 1035 state that if DNS servers are non-responsive, a DNS resolver should time out and retry a limited number of times. Specific timeout values or numbers of retries are not specified. Current EMTA DNS resolvers use a timeout value of 2 seconds, a retry count of 3, and an exponential backoff retry algorithm. The delay can then be estimated from the timeout value and the retry count. For example, a maximum delay can be as much as
$$\sum_{n=1}^{N} T_n, \quad \text{where } T_n = 2\,T_{n-1} \qquad \text{(Equation 1)}$$
where
$T_n$ = the timeout value, and
$N$ = the number of retries.
So for a timeout value (Tn) (wait time timeout value 804) of 2 seconds, and a retry count value (N) (retry count value 806) of 3, the delay may be 14 seconds.
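The arithmetic behind the 14 second figure can be checked with a short sketch. The function names `backoff_timeouts` and `max_delay` are hypothetical, and the sketch assumes that the first timeout equals the configured timeout value and that each later timeout doubles, as in Equation 1.

```python
from typing import List


def backoff_timeouts(initial_timeout: float, retries: int) -> List[float]:
    """Return T_1..T_N, where T_1 = initial_timeout and T_n = 2 * T_(n-1)."""
    timeouts = []
    timeout = initial_timeout
    for _ in range(retries):
        timeouts.append(timeout)
        timeout *= 2  # exponential back-off: double the wait each time
    return timeouts


def max_delay(initial_timeout: float, retries: int) -> float:
    """Sum of all timeouts: the worst-case wait when no query is answered."""
    return sum(backoff_timeouts(initial_timeout, retries))


print(backoff_timeouts(2, 3))  # [2, 4, 8]
print(max_delay(2, 3))         # 14
```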
Configurability: In embodiments, the following parameters may be made configurable so that the operators can set them depending on the traffic and demands of their networks. For example, in a VoIP application, if the maximum tolerance for receiving a dial tone is 1.5 seconds, then the wait time timeout value can be set to 0.5 seconds and the retry count value may be initialized to 2, totaling 1.5 seconds, before an entry is retrieved from long term store cache 310. The following management information bases (MIBs) are examples that may be used to configure various system parameters.
Timeout value for DNS queries: The following MIB may be used to configure the wait time timeout value used for DNS queries:
Max Retry value for DNS retries: The following MIB may be used to configure the maximum/initial retry count value for DNS queries:
Maximum TTL for Negative DNS RRs: The following MIB may be used to configure the timeout/expiration time value used for negative caching (negative cache 308):
Example advantages: Currently EMTAs have little to no DNS fault tolerance. Referring to
In one improvement, the timeout value is reduced from 2 seconds to 500 milliseconds. This results in reducing the backoff retry procedure from 14 seconds to 2.5 seconds. With this improvement, the backoff retry procedure runs to completion considerably faster, but if there is a failure, the user experience is not changed.
The addition of the permanent cache resolves the problem of the user never receiving a response to a DNS query, but leaves the issue of the long delay of 2.5 seconds before the response. This is because the EMTA needs to resolve the DNS address and obtain the latest DNS resolution, while also handling DNS servers that respond too slowly or suffer from excessive network traffic. So with current settings, the user may receive a response in around 2.5 seconds, which is fairly noticeable and could be improved.
In a further embodiment, negative caching, as described above, is used to resolve the delay issue. Negative caching saves the result of failed lookups for a fixed period of time (e.g., 5 minutes). On the first attempt, the user will experience a delay of 2.5 seconds, which is acceptable but still slightly noticeable, as the EMTA (e.g., DNS resolver 204) tries to resolve the FQDN by going through its backoff retry procedure. Once the EMTA determines that the FQDN resolution has failed due to a DNS server failure, the failure is stored as a negative cache entry. Therefore, for the next 5 minutes, the user experiences no delays and has normal voice activity. Every 5 minutes, the user will observe the same short delay followed by 5 minutes of normal response time. This will continue until the DNS server is back online. Further enhancements are described as follows:
Configurability: The timeout, retry count and the negative cache TTL may be configured in any manner, including by SNMP (simple network management protocol). For example, these values may be obtained by accessing an SNMP server across a network. This approach enables the operator to choose the optimum values for their networks. Operators can gather statistics on the DNS server responses, their maximum and minimum delays and set these values accordingly to make the user experience more pleasant. For example, these values may be set according to the MIBs described above, or in another manner.
Adaptive Fault Tolerance: With the adaptive approach, the delay experienced by the user every 5 minutes is reduced for as long as the DNS server stays non-responsive. The EMTA will reduce the number of retries every 5 minutes, and the user will experience shorter delays. After the negative cache has expired as many times as the retry count, the user will experience no delays, as the EMTA has adapted to the non-responsive DNS server. For example, with a timeout of 500 ms and a retry count of 2, the user will experience a 2 second delay on the first attempt, a 1 second delay after 5 minutes, and a relatively unnoticeable 500 ms delay for the rest of the time until DNS server 226 comes back to a responsive state.
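The two enhancements above can be sketched together as follows. The names `ResolverConfig` and `AdaptiveRetryCounter`, along with their fields and methods, are hypothetical, and the default values are only examples. Restoring the configured retry count once the DNS server responds is an assumption of this sketch. The sketch tracks only the retry count and does not attempt to reproduce the delay figures given above.

```python
from dataclasses import dataclass


@dataclass
class ResolverConfig:
    """Operator-configurable parameters (field names are hypothetical)."""
    timeout_s: float = 0.5         # wait time timeout value
    max_retries: int = 2           # maximum/initial retry count value
    negative_ttl_s: float = 300.0  # lifetime of a negative cache entry


class AdaptiveRetryCounter:
    """Tracks how many back-off retries the next failed lookup may use."""

    def __init__(self, config: ResolverConfig):
        self._config = config
        self.retries_left = config.max_retries

    def on_lookup_failed(self) -> None:
        """Called after each failed back-off procedure: allow one fewer retry next time."""
        if self.retries_left > 0:
            self.retries_left -= 1

    def on_lookup_succeeded(self) -> None:
        """Assumed behavior: restore the configured count once the server responds."""
        self.retries_left = self._config.max_retries


counter = AdaptiveRetryCounter(ResolverConfig(max_retries=2))
history = []
for _ in range(4):  # four negative cache expiries while the server stays down
    history.append(counter.retries_left)
    counter.on_lookup_failed()
print(history)  # [2, 1, 0, 0]
```

With an initial retry count of 2, the count reaches zero after the negative cache has expired twice, matching the statement that the delays stop after as many expiries as the retry count.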
There can be serious problems when DNS servers fail, especially for voice applications. This algorithm prevents the network from being flooded with DNS queries as voice signaling tries to connect to, and get a response from, signaling servers.
Further example advantages that may be provided by embodiments include continued voice service even in the case of a DNS server outage, reduced network traffic upon failures, and service providers being able to configure the fault tolerance parameters for optimal network traffic.
DNS resolver 302, DNS query generator 604, back-off retry module 606, and retry count modifier module 808 may be implemented in hardware, software, firmware, or any combination thereof. For example, DNS resolver 302, DNS query generator 604, back-off retry module 606, and/or retry count modifier module 808 may be implemented as computer program code configured to be executed in one or more processors. Alternatively, DNS resolver 302, DNS query generator 604, back-off retry module 606, and/or retry count modifier module 808 may be implemented as hardware logic/electrical circuitry.
The embodiments described herein, including systems, methods/processes, and/or apparatuses, may be implemented using well known servers/computers, such as a computer 1000 shown in
Computer 1000 can be any commercially available and well known computer capable of performing the functions described herein, such as computers available from International Business Machines, Apple, Sun, HP, Dell, Cray, etc. Computer 1000 may be any type of computer, including a desktop computer, a server, etc.
Computer 1000 includes one or more processors (also called central processing units, or CPUs), such as a processor 1004. Processor 1004 is connected to a communication infrastructure 1002, such as a communication bus. In some embodiments, processor 1004 can simultaneously operate multiple computing threads.
Computer 1000 also includes a primary or main memory 1006, such as random access memory (RAM). Main memory 1006 has stored therein control logic 1028A (computer software), and data.
Computer 1000 also includes one or more secondary storage devices 1010. Secondary storage devices 1010 include, for example, a hard disk drive 1012 and/or a removable storage device or drive 1014, as well as other types of storage devices, such as memory cards and memory sticks. For instance, computer 1000 may include an industry standard interface, such as a universal serial bus (USB) interface for interfacing with devices such as a memory stick. Removable storage drive 1014 represents a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup, etc.
Removable storage drive 1014 interacts with a removable storage unit 1016. Removable storage unit 1016 includes a computer useable or readable storage medium 1024 having stored therein computer software 1028B (control logic) and/or data. Removable storage unit 1016 represents a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, or any other computer data storage device. Removable storage drive 1014 reads from and/or writes to removable storage unit 1016 in a well known manner.
Computer 1000 also includes input/output/display devices 1022, such as monitors, keyboards, pointing devices, etc.
Computer 1000 further includes a communication or network interface 1018. Communication interface 1018 enables the computer 1000 to communicate with remote devices. For example, communication interface 1018 allows computer 1000 to communicate over communication networks or mediums 1042 (representing a form of a computer useable or readable medium), such as LANs, WANs, the Internet, etc. Network interface 1018 may interface with remote sites or networks via wired or wireless connections.
Control logic 1028C may be transmitted to and from computer 1000 via the communication medium 1042.
Any apparatus or manufacture comprising a computer useable or readable medium having control logic (software) stored therein is referred to herein as a computer program product or program storage device. This includes, but is not limited to, computer 1000, main memory 1006, secondary storage devices 1010, and removable storage unit 1016. Such computer program products, having control logic stored therein that, when executed by one or more data processing devices, cause such data processing devices to operate as described herein, represent embodiments of the invention.
Devices in which embodiments may be implemented may include storage, such as storage drives, memory devices, and further types of computer-readable media. Examples of such computer-readable storage media include a hard disk, a removable magnetic disk, a removable optical disk, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROM), and the like. As used herein, the terms “computer program medium” and “computer-readable medium” are used to generally refer to the hard disk associated with a hard disk drive, a removable magnetic disk, a removable optical disk (e.g., CDROMs, DVDs, etc.), zip disks, tapes, magnetic storage devices, MEMS (micro-electromechanical systems) storage, nanotechnology-based storage devices, as well as other media such as flash memory cards, digital video discs, RAM devices, ROM devices, and the like. Such computer-readable storage media may store program modules that include computer program logic for DNS resolver 302, DNS query generator 604, back-off retry module 606, and retry count modifier module 808, and/or flowchart 500 (including any one or more steps of flowchart 500), and/or further embodiments of the present invention described herein. Embodiments of the invention are directed to computer program products comprising such logic (e.g., in the form of program code or software) stored on any computer useable medium. Such program code, when executed in one or more processors, causes a device to operate as described herein.
The invention can work with software, hardware, and/or operating system implementations other than those described herein. Any software, hardware, and operating system implementations suitable for performing the functions described herein can be used.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
This application claims the benefit of U.S. Provisional Application No. 61/219,901, filed on Jun. 24, 2009, which is incorporated by reference herein in its entirety.
Number | Name | Date | Kind |
---|---|---|---|
6701363 | Chiu et al. | Mar 2004 | B1 |
7130922 | Barrow | Oct 2006 | B1 |
7552237 | Cernohous et al. | Jun 2009 | B2 |
7565423 | Fredricksen | Jul 2009 | B1 |
7567582 | Westhead et al. | Jul 2009 | B2 |
7720936 | Plamondon | May 2010 | B2 |
7853721 | Awadallah et al. | Dec 2010 | B2 |
20020042821 | Muret et al. | Apr 2002 | A1 |
20040039798 | Hotz et al. | Feb 2004 | A1 |
20040073707 | Dillon | Apr 2004 | A1 |
20040078487 | Cernohous et al. | Apr 2004 | A1 |
20050169169 | Gadde | Aug 2005 | A1 |
20050204039 | Douglis et al. | Sep 2005 | A1 |
20060031394 | Tazuma | Feb 2006 | A1 |
20060242227 | Rao et al. | Oct 2006 | A1 |
20070041393 | Westhead et al. | Feb 2007 | A1 |
20080228899 | Plamondon | Sep 2008 | A1 |
20090157889 | Treuhaft | Jun 2009 | A1 |
20090210872 | Dai et al. | Aug 2009 | A1 |
20100153969 | Dyba et al. | Jun 2010 | A1 |
20100217837 | Ansari et al. | Aug 2010 | A1 |
20100232592 | Ku | Sep 2010 | A1 |
20100274970 | Treuhaft et al. | Oct 2010 | A1 |
20100306535 | Jain et al. | Dec 2010 | A1 |
20130079022 | Ku | Mar 2013 | A1 |
20130266132 | Ku | Oct 2013 | A1 |
Entry |
---|
Andrews, M., “Negative Caching of DNS queries (DNS NCACHE)”, RFC 2308, Mar. 1998. |
Dilley et al., “Globally Distributed Content Delivery”, IEEE Internet Computing, pp. 50-58, Oct. 2002. |
Ballani, Hitesh, and Paul Francis. “A simple approach to DNS DoS mitigation.” Irvine Is Burning (2006): pp. 67-72. |
Cohen, Edith, and Haim Kaplan. “Proactive caching of DNS records: Addressing a performance bottleneck.” Computer Networks 41.6 (2003): 707-726. |
Ballani, Hitesh, and Paul Francis. “Mitigating DNS dos attacks.” Proceedings of the 15th ACM conference on Computer and communications security. ACM, 2008. |
“Architecture Framework Technical Report”, PacketCable™ 1.5, PKT-TR-ARCH1.5-V02-070412, Cable Television Laboratories, Inc., (Apr. 12, 2007), 58 pages. |
Number | Date | Country | |
---|---|---|---|
20100332680 A1 | Dec 2010 | US |
Number | Date | Country | |
---|---|---|---|
61219901 | Jun 2009 | US |