The present invention generally concerns methods and apparatus for use in multi-server web sites and web browsers and more specifically concerns methods and apparatus that improve the perceived responsiveness of multi-server web sites and web environments.
Users seeking content (such as, for example, web pages) from a web site often are confronted by delays. The sought web page may take tens of seconds to load, even if the user has a high-speed connection. The situation may not improve with each successive web page requested from the web site. Each successive web page may similarly load slowly. The perceived lack of responsiveness of the web site may be a source of dissatisfaction for a user. In fact, it might lead the user to conclude that the web site is poorly designed and managed. In view of such a conclusion, the user may seek another web site providing the same information or service.
What users may not understand is that the perceived lack of responsiveness of the web site may not be the fault of the web site itself but may actually be caused by other elements of the web that are separate from the web site. For example, servers associated with the Domain Name System (hereinafter “DNS”) of the web often are consulted to service a request for web content. These servers are not part of the web site providing the content, but often provide address information needed to properly address requests to the web site for content. Delays in the DNS system in providing the address information may mistakenly be attributed to the web site.
In more detail, DNS servers typically have associated address caches where address information for recently visited web sites is stored. For example, when a web client issues a request for a web page maintained by a web site that has not been recently visited, the web client will issue a DNS request to the DNS system for the address information. Depending upon the DNS server used by the client, the DNS request resolution time can vary. For DNS servers that serve a large number of clients, the address for the web site may already be in the cache of a local DNS server. In such a situation, the DNS system simply provides the address information to the web client from the cache of the local DNS server. For a DNS server that services a small number of clients, the server typically will not have the DNS entry within its local cache, resulting in a so-called “DNS cache miss”. When a DNS cache miss occurs, the local DNS server used by the client must make requests to additional DNS servers. These additional DNS requests can result in a long wait period for the web client.
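By way of non-limiting illustration, the cache behavior described above can be sketched in Python as follows. The class name `CachingResolver`, the `upstream` callable, the hostname and the address (taken from the documentation range 192.0.2.0/24) are assumptions made for this sketch only; a real DNS server would also honor record lifetimes, which are omitted here.

```python
class CachingResolver:
    """Toy model of a local DNS server with an address cache (illustrative only)."""

    def __init__(self, upstream):
        # 'upstream' is a hypothetical callable standing in for the additional
        # DNS servers that must be consulted on a cache miss.
        self.upstream = upstream
        self.cache = {}
        self.misses = 0

    def resolve(self, hostname):
        if hostname in self.cache:
            # Cache hit: answer immediately from the local cache.
            return self.cache[hostname]
        # Cache miss: the slow path, requiring requests to other servers.
        self.misses += 1
        address = self.upstream(hostname)
        self.cache[hostname] = address
        return address


# Hypothetical upstream data for the example.
records = {"www.example.com": "192.0.2.10"}
resolver = CachingResolver(lambda name: records[name])
first = resolver.resolve("www.example.com")   # miss, consults upstream
second = resolver.resolve("www.example.com")  # hit, served from cache
print(first, second, resolver.misses)
```

In this sketch only the first lookup of a hostname pays the cost of the upstream request, which corresponds to the DNS cache miss described above.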
This problem is compounded when a web client visits a page that contains many URLs that each have different hostnames, even if the hostnames are within the same network domain. Each different hostname requires a DNS request and response. If there is a DNS delay, the user may be given the false impression that the web site is at fault when in fact it is the DNS system that is causing the delay.
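As a rough illustration of why such a page is exposed to repeated DNS delays, the following Python sketch counts the distinct hostnames among the URLs of a page, each of which may require its own DNS request and response. The function name and the example.com URLs are hypothetical and are used only for this example.

```python
from urllib.parse import urlparse


def distinct_hostnames(urls):
    """Return the set of distinct hostnames found in a list of absolute URLs."""
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname  # lowercased hostname, or None if absent
        if host:
            hosts.add(host)
    return hosts


# Hypothetical URLs referenced by a single web page. All share one domain,
# but three different hostnames appear.
page_urls = [
    "http://www.example.com/welcome.html",
    "http://images.example.com/logo.gif",
    "http://images.example.com/banner.gif",
    "http://auth.example.com/login.js",
]
hosts = distinct_hostnames(page_urls)
print(len(hosts), sorted(hosts))
```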
Similar problems may be encountered when a web page is produced using multiple content elements stored on web servers having different hostnames. If there is a delay in receiving any of the content elements, depending on the browser used, either the entire web page may be delayed in rendering, or the web page may be only partially rendered leaving blanks where content (such as, for example, images; text; dynamic elements) should appear.
Thus, those skilled in the art seek improvements for use in multi-server web sites and web environments that improve the perceived responsiveness of the web sites and web environments to users. In particular, those skilled in the art seek improvements that reduce the likelihood that delays occurring in the DNS system will actually be perceived by a user. DNS cache misses and resulting DNS delays may still occur, but those skilled in the art seek improvements that mask the delays from the user so that the user does not realize that delays are occurring.
The foregoing and other problems are overcome, and other advantages are realized, in accordance with the following embodiments of the invention.
A first embodiment of the invention is a method. In the method, movement from a first web page to a second web page is identified as an expected transition. It is assumed that the first web page is stored on a first server having a first hostname, and the second web page is stored on a second server having a second hostname different from the first hostname. Then, identification information is selected for the second web page that can be used to issue a DNS request for address information needed to service a web request for the second web page. Next, the first web page is associated with identification information for the second web page. Then, the association of the first web page with identification information for the second web page is saved to computer memory.
A second embodiment of the invention is a method occurring in a multi-server web environment. In the second method, a first web request is received from a web client for web content. Then, a response to the first web request is generated, wherein the response comprises the web content sought by the first web request and additional information. The additional information can be used by the web client to issue at least one DNS request for address information that will be needed to service an anticipated second web request. It is expected that the web client will issue the anticipated second web request after the first web request. Next, the response containing the web content sought by the first web request and the additional information is transmitted to the web client.
A third embodiment of the invention is a computer program product comprising a signal bearing medium tangibly embodying a computer readable program executable by digital processing apparatus. The computer readable program, when executed by digital processing apparatus, is configured to receive a first web request from a web client for a first web page; to generate a response to the first web request, wherein the response comprises the web page sought by the first web request and content element identification information identifying additional web content that will be needed to reproduce a second web page; and to transmit the response to the web client.
A fourth embodiment of the invention is a system comprising: a web site incorporating a plurality of web servers, at least first and second web servers of the plurality having URLs with hostnames different from one another; a computer memory coupled to the web site, the computer memory storing a computer program configured to perform operations for managing the web site when executed by digital processing apparatus; and a digital processing apparatus coupled to the web site and the computer memory, the digital processing apparatus configured to execute the computer program. Each of the first and second web servers with hostnames different from one another provides content used to produce a particular web page. When the computer program is executed by the digital processing apparatus, the system is configured to identify that the particular web page is produced using content provided by the first and second servers, wherein the first and second servers have different hostnames; to change the hostname associated with content used in the particular web page provided by the second web server to the hostname of the first web server; and to copy the content with the changed hostname to the first web server.
In conclusion, the foregoing summary of the various embodiments of the present invention is exemplary and non-limiting. For example, one of ordinary skill in the art will understand that one or more aspects or steps from one embodiment can be combined with one or more aspects or steps from another embodiment to create a new embodiment within the scope of the present invention.
The foregoing and other aspects of these teachings are made more evident in the following Detailed Description of the Invention, when read in conjunction with the attached Drawing Figures, wherein:
One embodiment of the invention is practiced in a situation when a web client issues a request for a series of web pages. Ordinarily, before any web page request can be issued, the web client must first obtain the DNS entry for the web page server (the IP address of the web server) that contains the web page. Methods and apparatus of this invention anticipate this by including, in web pages responsive to initial requests, URLs for web pages likely to be requested later in the series. When the contents of the initial web pages are delivered to the web client, the web pages contain embedded URLs whose hostnames identify other servers within the website that the web client will likely visit. By including URLs in the first or initial web pages that contain the hostnames that the web client will likely need in order to make further requests for subsequent web pages, the web client can get a head start on resolving hostnames. This process will allow the web client to skip the DNS request/response steps when a subsequent web page is requested from one of the other web servers because the web client will already have saved the DNS entry within its local DNS cache. The included URLs are encoded as “hidden” images or other web objects which allow the initial page to be rendered without having to wait for the hidden images or objects to be downloaded from the site. These hidden images could be one pixel images that appear at the end of the page or off-screen from the page.
Including the hidden images within a web page causes the web client to pre-fetch DNS entries that the web client will likely need for a subsequent HTTP request.
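One possible way of embedding such hidden images is sketched below in Python. The function names, the `/pixel.gif` path, the off-screen styling and the example.com hostnames are assumptions made for illustration; the description above requires only that the embedded object cause the web client to resolve the desired hostname without being visible to the user.

```python
from html import escape


def hidden_image_tag(hostname, path="/pixel.gif"):
    """Build a one pixel, off-screen image tag whose URL uses the given hostname.

    The path '/pixel.gif' is a hypothetical placeholder object.
    """
    url = f"http://{hostname}{path}"
    return (
        f'<img src="{escape(url, quote=True)}" width="1" height="1" alt="" '
        'style="position:absolute; left:-9999px;">'
    )


def add_dns_hints(page_html, hostnames):
    """Insert hidden image tags at the end of the page body.

    Placing the tags last lets the visible content render before the
    hidden objects are requested.
    """
    tags = "\n".join(hidden_image_tag(h) for h in hostnames)
    marker = "</body>"
    if marker in page_html:
        return page_html.replace(marker, tags + "\n" + marker, 1)
    return page_html + "\n" + tags


welcome = "<html><body><h1>Welcome</h1></body></html>"
print(add_dns_hints(welcome, ["logon.example.com"]))
```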
In one embodiment, the present invention is a method and apparatus for providing information to a web client so that the web client can pre-resolve DNS entries. Embodiments of the present invention also allow website creators to normalize the network load over a series of web pages by copying embedded URLs contained within the pages. Embodiments of the invention further allow creators of websites to minimize the number of TCP connections that a client must use in order to view a particular web page.
The web site 101 consists of four web servers 103, 105, 107 and 109 that may have the same Internet domain name but may each have a different hostname.
If the local DNS server 123 does not have the address entry for each of the web servers 103, 105, 107, 109 when the web client 111 makes each of the DNS requests 121, there can be a delay as the DNS server 123 obtains the address for the web client 111. The delay can occur for each request for each web server within the web site. This delay is not the fault of the website 101 but rather of the DNS server that the web client 111 is using.
In order to improve a user's experience of the website 101 the author of the website 101 can make improvements in accordance with the invention to speed up the resolution of DNS names.
The first step is to create a set of possible transitions from one web page to another. A transition occurs between a pair of web pages where an initial web page of the pair is a web page that a user would be expected to view first and the later page is a web page that the user would be expected to view after the user has viewed the initial web page. For web site 101 this would result in a transition set of: (Welcome 139 to Logon 141), (Logon 141 to Authenticate 143), (Authenticate 143 to E-mail 145). Depending on implementation, the transition set can consist of all possible transitions or just the most popular. “Most popular” can be manually chosen or determined from past history of the web site 101.
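Where the most popular transitions are determined from past history, one simple approach is to count consecutive page pairs in recorded sessions. The following Python sketch assumes that the history is available as lists of page names in the order visited; the session data and the function name are hypothetical.

```python
from collections import Counter


def popular_transitions(sessions, top_n):
    """Return the top_n most frequent (initial page, later page) pairs."""
    counts = Counter()
    for pages in sessions:
        # Pair each page with the page viewed immediately after it.
        counts.update(zip(pages, pages[1:]))
    return [pair for pair, _ in counts.most_common(top_n)]


# Hypothetical past history for web site 101.
sessions = [
    ["Welcome", "Logon", "Authenticate", "E-mail"],
    ["Welcome", "Logon", "Authenticate", "E-mail"],
    ["Welcome", "Logon"],
]
transition_set = popular_transitions(sessions, top_n=3)
print(transition_set)
```

Choosing a smaller `top_n` keeps only the most popular transitions, while a large value yields all observed transitions.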
Using the set of web page transitions, URLs can be added to each web page, where the URLs refer to a future page, or to a future series of web pages. The embedded URLs can refer to a “hidden” image file that appears out of the normal view of the user. Using a hidden image allows the web page to render on the screen while the DNS request 121 and the response 125 are taking place for the hostname of the server that contains the hidden image. There is typically some “think time” after a page is rendered on a screen, during which the user views the page. During this think time the DNS request 121 and the response 125 can occur, leaving the user unaware of the delay.
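The selection of which hostnames to reference from each web page can be sketched as follows, assuming a transition set and a mapping from each page to the hostname of the server storing it. The hostnames shown are hypothetical; the description above does not specify them.

```python
def hint_hostnames(transitions, page_host):
    """Map each initial page to the hostnames it should reference in advance.

    A hostname is listed only when the later page is stored on a server
    whose hostname differs from that of the initial page.
    """
    hints = {}
    for first, second in transitions:
        src, dst = page_host[first], page_host[second]
        if dst != src:
            targets = hints.setdefault(first, [])
            if dst not in targets:
                targets.append(dst)
    return hints


# Hypothetical hostnames for the servers storing each page.
page_host = {
    "Welcome": "www.example.com",
    "Logon": "logon.example.com",
    "Authenticate": "auth.example.com",
    "E-mail": "mail.example.com",
}
transitions = [
    ("Welcome", "Logon"),
    ("Logon", "Authenticate"),
    ("Authenticate", "E-mail"),
]
hints = hint_hostnames(transitions, page_host)
print(hints)
```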
Referring to
In other embodiments, URLs for multiple web pages in a sequence can be included in a web page likely to be requested first. In the example of
The hidden image can be something such as a 1 pixel image or an image that appears off the screen or out of view of the user. In addition other objects could be used instead of an image URL such as a script file. Anything that causes the web client 111 to make a DNS request for the desired hostname could be used.
In alternate embodiments, actual physical address information may be provided. This can be done when it is known that the address of an item is unlikely to change. In such circumstances, the need for a DNS query is eliminated entirely. Since addresses for many items can be expected to change, though, it is better in other instances to provide identification information (URLs) so the DNS system can be queried for an up-to-date address.
In addition, the web site creator can use the identified set of web page transitions to normalize the network load over the set of web servers 103, 105, 107, 109 within website 101. This process would occur by moving content that appears on subsequent web pages to previous pages. For example, if an image file is shown on the logon page 141 on web server 105 then a hidden URL can be placed on the welcome page 139 for the web server 103. Doing this results in a web client 111 pre-fetching the image for the logon page 141 before the actual logon page 141 is requested by the web client 111. This image when requested as part of the Welcome page 139 would be hidden from the user. When the HTTP request 115 is made for the Logon page, the web client 111 will already have the image within its cache and will not have to request the image in order to render the Logon page 141.
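A minimal sketch of this pre-fetching step is given below in Python: the image URLs used by the later page are collected, and hidden image tags for them are generated for inclusion in the earlier page. The class and function names, the off-screen styling and the example URL are assumptions made for illustration only.

```python
from html import escape
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def prefetch_tags(later_page_html):
    """Return hidden image tags that cause the later page's images to be fetched."""
    collector = ImageCollector()
    collector.feed(later_page_html)
    return [
        f'<img src="{escape(src, quote=True)}" width="1" height="1" alt="" '
        'style="position:absolute; left:-9999px;">'
        for src in collector.sources
    ]


# Hypothetical Logon page stored on a server with a different hostname.
logon_page = (
    '<html><body><img src="http://logon.example.com/header.gif">'
    "<form>...</form></body></html>"
)
tags = prefetch_tags(logon_page)
print(tags)
```

The resulting tags would be placed in the Welcome page so that the image is already in the cache of the web client when the Logon page is later requested.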
In a variation of this embodiment where security is a concern, providing content in anticipation of future need can be selectively disabled either by the web site or web client. For example, the web client can signal the web site that address information or content should not be provided in anticipation of future need.
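The manner in which the web client signals the web site is not specified above. Purely as an assumption for illustration, the following Python sketch uses a hypothetical request header named `X-No-Prefetch-Hints`; this header is not a standard HTTP header and is invented for this example.

```python
# Hypothetical header name, invented for this sketch; not a standard header.
OPT_OUT_HEADER = "x-no-prefetch-hints"


def should_include_hints(request_headers):
    """Return False when the web client has asked that no anticipatory
    address information or content be included in the response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {
        name.lower(): value.strip()
        for name, value in request_headers.items()
    }
    return normalized.get(OPT_OUT_HEADER) != "1"


print(should_include_hints({"Host": "www.example.com"}))
print(should_include_hints({"X-No-Prefetch-Hints": "1"}))
```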
A third improvement provided by embodiments of the invention is obtained by reducing the number of TCP connections needed to render a web page. The number of TCP connections is equal to the number of unique hostnames found within all of the URLs of a web page. Using this technique, if an image file URL within the Welcome page 139 on web server 103 is located on another web server (105, 107, 109) within the web site 101, then that image could be copied from the other web server to web server 103. This process would allow a web client 111 to render a web page with only one TCP connection, using multiple HTTP requests over the same TCP connection (persistent connections, as supported by HTTP 1.1).
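The hostname consolidation just described can be sketched in Python as follows. The function name and the example.com URLs are hypothetical, and the sketch assumes that the copied content keeps its original path on the first web server and that no explicit port numbers appear in the URLs.

```python
from urllib.parse import urlparse, urlunparse


def consolidate(urls, primary_host):
    """Rewrite URLs so that every hostname becomes primary_host.

    Returns the rewritten URLs together with the original URLs whose
    content must be copied to the primary server.
    """
    rewritten, to_copy = [], []
    for url in urls:
        parts = urlparse(url)
        if parts.hostname and parts.hostname != primary_host:
            to_copy.append(url)
            # Replacing netloc assumes no port or credentials in the URL.
            parts = parts._replace(netloc=primary_host)
        rewritten.append(urlunparse(parts))
    return rewritten, to_copy


# Hypothetical URLs used by the Welcome page.
welcome_urls = [
    "http://www.example.com/welcome.html",
    "http://images.example.com/logo.gif",
]
rewritten, to_copy = consolidate(welcome_urls, "www.example.com")
hosts_after = {urlparse(u).hostname for u in rewritten}
print(rewritten, to_copy, len(hosts_after))
```

After the rewrite, all URLs of the page share a single hostname, so the page can be retrieved over one connection once the listed content has been copied.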
The three improvements just described can be automated by including the methods within a web site authoring application.
In summary,
In one variant of the embodiment depicted in
Typically, the identification information for the second web page comprises a URL. Further, in typical embodiments, the URL comprising the identification information for the second web page is hidden so that when the first web page is displayed the URL for the second web page is not visible.
In another variant of the method depicted in
In a further variant of the method depicted in
Yet another variant of the method depicted in
In a still further variant, the first web server receives a first web request for the first web page from a web client; and the first web server transmits the web page in response to the web request. As indicated previously, the first web page has been amended to incorporate content element identification information identifying a content element needed to reproduce a second web page. Accordingly, upon receipt of the first web page, the web client treats the URLs in the first web page (including the URL corresponding to the content element needed to reproduce the second web page) like any other URL and requests the content corresponding to the URLs. In this manner, the web client issues a request for the content element needed to reproduce the second web page before the web client receives a request for the second web page from the user.
In another variant of the method depicted in
Another embodiment of the aspect of the invention that provides identification information that can be used to issue a DNS request for address information needed to service a web request for a web page before the web request is actually received is depicted in
In a variant of the method depicted in
As indicated, the method depicted in
In another variant of the method depicted in
In the method of
In a variant of the method depicted in
In another variant of the method depicted in
At step 510 of the method, it is identified that a particular web page is produced using content provided by first and second servers, wherein the first and second servers have different hostnames. Then, at step 520, the hostname associated with the content used in the particular web page provided by the second web server is changed to the hostname of the first web server. Next, at step 530, the content with the changed hostname originally stored in the second web server is copied to the first web server. These operations reduce the number of TCP connections needed to reproduce the particular web page by one. If the content needed to produce the particular web page is only provided by the first and second web servers, the method depicted in
One of ordinary skill in the art will understand that methods depicted and described herein can be embodied in a computer program storable in a tangible computer-readable memory medium or signal bearing medium. Instructions embodied in the tangible computer-readable memory or signal-bearing medium perform the steps of the methods when executed. Tangible computer-readable memory media include, but are not limited to, hard drives, CD or DVD ROM, flash memory storage devices, and RAM memory of a computer system.
In addition, apparatus associated with a web site such as the one depicted in
Thus it is seen that the foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the best apparatus and methods presently contemplated by the inventors for improving interactions between web sites comprised of servers having different hostnames and web clients. One skilled in the art will appreciate that the various embodiments described herein can be practiced individually; in combination with one or more other embodiments described herein; or in combination with web environments differing from those described herein. Further, one skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments; that these described embodiments are presented for the purposes of illustration and not of limitation; and that the present invention is therefore limited only by the claims which follow.