Search engines use webcrawlers to fetch web pages for indexing. The search engines index pages in order to determine which pages, if any, satisfy search criteria, such as search words. A webcrawler can be directed to a particular web page in a number of ways. For example, a webcrawler can reach the particular web page automatically by starting at an origination page and traversing a set of web pages that are linked to each other and ultimately to the origination page. Alternatively, the webcrawler can receive a web page that is submitted by a user, for example, an owner of the particular web page.
However, some web pages are dynamically generated, for example, in response to a user's request for the page. Such dynamic web pages are often used in systems that provide for interaction with data subject to frequent change. For example, a web shop for the sale of products provides for presentation of and interaction with catalog data, such as identity of products for sale, number of products for sale, price of the products, etc. Such data is subject to frequent change. The products that are available for purchase change. The price of the products frequently changes. Product features change. Therefore, sellers constantly update a file or database of such catalog data. A purchaser requests dynamic pages for interaction with the catalog data, for example to browse and purchase catalog products.
Many webcrawlers do not traverse or otherwise accept a dynamic page for indexing. For example, such webcrawlers identify a page as dynamic by special characters in the Uniform Resource Locator (URL), and do not fetch such identified pages for indexing.
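By way of illustration only, the kind of URL check described above may be sketched as follows. The set of marker characters and the function name looks_dynamic are assumptions made for this example; actual webcrawlers may apply different or additional rules.

```python
# Hypothetical marker characters; a real webcrawler may use a different set.
DYNAMIC_MARKERS = ("?", "&", "=")


def looks_dynamic(url: str) -> bool:
    """Return True if the URL contains any character assumed to mark a dynamic page."""
    return any(marker in url for marker in DYNAMIC_MARKERS)


if __name__ == "__main__":
    for candidate in (
        "http://shop.example/product?id=42",
        "http://shop.example/catalog/42.html",
    ):
        print(candidate, looks_dynamic(candidate))
```

Under this heuristic, a page addressed with a query string would be skipped, while a page addressed by a plain path would be fetched for indexing.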
In order to provide access to product catalog data via search engines, it is conventional to generate static pages that correspond to the catalog data. The static pages are not generated in response to a request for the page. They are generated prior to the page request and are retrieved from a memory location in response to the page request. The static pages, rather than dynamic pages, are indexed by search engines. However, the static pages are used instead of the dynamic pages, and legacy systems for which the static pages are generated are therefore significantly changed so that they refer to the static pages, rather than to the dynamic pages. For example, a legacy web shop that refers to dynamic pages for products that are on sale is changed to refer to the static pages.
Furthermore, systems for the interaction with frequently changing data, e.g., web shops, often open interactive sessions, e.g., a shopping session. Such sessions are opened by generating a session object. The generation and maintenance of the session object consume processor time and memory space. For example, a basket is maintained in which the processor keeps track of the products a user selects for purchase. After indexing of the static pages, the search engines can return to a user a link to such a static page in response to a user search. Often, a user selects such a returned link because the link seems to be of interest. Once a page is returned in response to the selection, the user realizes that the page is in fact not of interest. However, as soon as the static page is returned in response to the selection of the link, processor time and memory are unnecessarily used to generate the session object.
Accordingly, there is a need in the art for a system and method for enabling the indexing of pages of dynamic-page based systems without requiring significant change to legacy systems, and for more efficiently using processor time and memory.
Embodiments of the present invention relate to a computer system and method that may generate a corresponding static page, e.g., a Hyper Text Markup Language (HTML) page, for each group of data for which a dynamic page may be generated, e.g., for each product of a catalog database, without requiring significant change in legacy application environments in which the dynamic pages are generated, and that may conserve processor time.
After an update to such data, e.g., catalog data, a processor may generate a static web page associated with the updated catalog data. A dynamic page that corresponds to the generated static page may be generated and returned to a user in response to a page request even after generation of the static page. For example, a link may be provided to a dynamic page for selection by a user during an interactive session, e.g., a shopping session. Accordingly, references by a legacy application environment to dynamic pages, such as links to the dynamic pages, may be maintained.
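A minimal sketch of generating one static HTML page per product is given below. The catalog layout (a mapping from product ID to a record with "name" and "price" fields), the static/ path prefix, and the function names are assumptions made for illustration and are not prescribed by the embodiments.

```python
import html


def render_static_page(product: dict) -> str:
    """Render a minimal static HTML page for one catalog product."""
    name = html.escape(product["name"])
    price = html.escape(str(product["price"]))
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><p>Price: {1}</p></body></html>"
    ).format(name, price)


def generate_static_pages(catalog: dict) -> dict:
    """Map a hypothetical static page path to the page generated for each product."""
    return {
        f"static/{product_id}.html": render_static_page(product)
        for product_id, product in catalog.items()
    }


if __name__ == "__main__":
    pages = generate_static_pages({"42": {"name": "Tea & Co", "price": "3.50"}})
    for path, body in pages.items():
        print(path, body)
```

Because the static pages are written alongside, rather than in place of, the dynamic pages, links in the legacy application to the dynamic pages are left untouched in this sketch.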
The processor may maintain an index web page. The index page may include links to each generated static page, and may be submitted to a webcrawler. After generation of a static page, the processor may insert a link to the static page in the index page and may omit submitting the generated static page to the webcrawler. The webcrawler may traverse the generated static page since the static page is linked to the index page submitted to the webcrawler.
In one embodiment of the present invention, the processor may insert code into the static pages or a dynamic page ID into their URLs for redirection to a web shop for opening a web shop session and for generating and displaying a dynamic page instead of the static page.
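One possible form of such inserted redirection code is an HTML meta refresh element that carries the dynamic page ID in its target URL. The sketch below assumes this mechanism, a hypothetical shop entry URL /shop, and a hypothetical query parameter named page; the embodiments do not specify any of these.

```python
import html
from urllib.parse import urlencode

SHOP_ENTRY_URL = "/shop"  # hypothetical entry point of the web shop


def redirect_url(dynamic_page_id: str) -> str:
    """Build a URL that carries the dynamic page ID as a query parameter."""
    return SHOP_ENTRY_URL + "?" + urlencode({"page": dynamic_page_id})


def add_redirect(static_html: str, dynamic_page_id: str) -> str:
    """Insert a meta refresh element right after the first <head> tag."""
    target = html.escape(redirect_url(dynamic_page_id), quote=True)
    meta = f'<meta http-equiv="refresh" content="0; url={target}">'
    return static_html.replace("<head>", "<head>" + meta, 1)


if __name__ == "__main__":
    page = "<html><head><title>Tea</title></head><body></body></html>"
    print(add_redirect(page, "42"))
```

A browser that loads the resulting static page would be sent on to the web shop, which may then open a session and generate the corresponding dynamic page.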
In an alternative embodiment of the present invention, the processor may configure generated static pages so that an interactive session is not opened when a user accesses the static pages. On the other hand, the processor may configure dynamic pages so that the interactive session is opened when the user accesses the dynamic pages.
In an embodiment of the present invention, the processor 105 may generate an index page 115 that includes for each generated static page 110 a corresponding entry. For example, the processor 105 may generate the index page 115 immediately following generation of the first static page 110, and may insert an index entry immediately following each static page generation. Alternatively, the processor 105 may generate all of the static pages 110, and may subsequently generate the index page 115 and insert entries for all of the generated static pages 110. Each index page entry may include a page address link, e.g., a URL, for its corresponding static page 110. In one embodiment, to keep the size of the index page 115 from becoming very large, a number of index pages 115 may be generated. Each of the index pages 115 may include links to different static pages 110. For example, different index pages 115 may be generated for different categories of products.
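The generation of one index page per product category may be sketched as follows. The input format (a list of category, URL, and title triples), the index_<category>.html naming scheme, and the function name are assumptions made for this example.

```python
import html
from collections import defaultdict


def build_index_pages(static_pages: list) -> dict:
    """Group (category, url, title) triples and build one index page per category."""
    by_category = defaultdict(list)
    for category, url, title in static_pages:
        by_category[category].append((url, title))

    index_pages = {}
    for category, entries in by_category.items():
        # Each index entry is a link to the corresponding static page.
        links = "".join(
            f'<li><a href="{html.escape(url, quote=True)}">{html.escape(title)}</a></li>'
            for url, title in entries
        )
        index_pages[f"index_{category}.html"] = (
            f"<html><body><ul>{links}</ul></body></html>"
        )
    return index_pages


if __name__ == "__main__":
    pages = build_index_pages([
        ("tea", "static/1.html", "Green Tea"),
        ("tea", "static/2.html", "Black Tea"),
        ("mugs", "static/3.html", "Mug"),
    ])
    for name, body in pages.items():
        print(name, body)
```

Splitting by category is only one way to bound the size of each index page; any partition in which every static page is linked from some index page would serve the same purpose.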
After the processor 105 inserts into the index page 115 an entry for each of the generated static pages 110, the processor 105 may transmit the index page 115 towards a webcrawler of a search engine 120. Once the webcrawler receives the index page 115, the webcrawler may traverse the links in the index page 115 that point to the generated static pages 110. The webcrawler may therefore enable the search engine 120 to index each of the static pages 110, even without the transmission by the processor 105 of the static pages 110 to the webcrawler when the static pages 110 are generated. The catalog data for which dynamic pages may be generated may therefore be indexed by the search engine 120.
In an embodiment of the present invention, when a change is made to the catalog data 100, the processor 105 may accordingly update the static pages 110 and the entries of the index page 115. Particularly with respect to a web shop, an update to the catalog data 100 may include, e.g., a change to a status of a product that was previously for sale but is no longer for sale, an addition of a new product put up for sale, or a change to information related to a product that is for sale, such as price or quantity. When a product is added, the processor 105 may generate a new static page 110 and insert a corresponding new entry into the index page 115. When the status of a product is changed so that it is no longer for sale, the processor 105 may remove from the index page 115 the product's corresponding entry. When information related to a product is changed, the processor 105 may remove from the index page 115 the entry corresponding to the previously generated static page 110, may generate a new static page 110, and may insert into the index page 115 a new entry corresponding to the new static page 110.
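The three update cases described above may be sketched as follows. The change record layout, the in-memory dictionaries standing in for the static pages 110 and the entries of the index page 115, and the render callback are all assumptions made for illustration.

```python
def apply_catalog_change(change, static_pages, index_entries, render):
    """Update static pages and index entries for one catalog change.

    change: dict with "kind" ("added", "changed" or "removed"), "product_id",
            and, for additions and changes, "product" (hypothetical layout).
    static_pages: dict mapping product ID to generated HTML.
    index_entries: dict mapping product ID to the static page URL.
    render: callable that turns a product record into HTML.
    """
    kind = change["kind"]
    product_id = change["product_id"]
    if kind == "removed":
        # Only the index entry is removed; whether the static page itself is
        # deleted is not stated in the description, so it is left in place here.
        index_entries.pop(product_id, None)
    elif kind in ("added", "changed"):
        # For a change, the old entry is removed before the new page is linked.
        index_entries.pop(product_id, None)
        static_pages[product_id] = render(change["product"])
        index_entries[product_id] = f"static/{product_id}.html"
    else:
        raise ValueError(f"unknown change kind: {kind!r}")


if __name__ == "__main__":
    pages, index = {}, {}
    render = lambda p: "<html>" + p["name"] + " " + str(p["price"]) + "</html>"
    apply_catalog_change(
        {"kind": "added", "product_id": "42", "product": {"name": "Tea", "price": 3}},
        pages, index, render,
    )
    print(pages, index)
```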
In one embodiment of the present invention, the processor 105 may periodically check the catalog data 100 for changes. Alternatively, when a change is made to the catalog data 100, the processor 105 may be automatically alerted of the change. In response to the alert, the processor 105 may update the static pages 110 and the index page 115. Alternatively, a user may instruct the processor 105 to update the static pages 110 and the index page 115. In response, the processor 105 may check the catalog data 100 to determine which changes, if any, have been made. If any changes have been made, the processor 105 may accordingly update the static pages 110 and the index page 115.
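One way the check for changes might be carried out is by comparing the current catalog data against a snapshot taken at the previous check. The sketch below assumes that both are available as mappings from product ID to product record; the function name and the tuple format of the result are hypothetical.

```python
def diff_catalog(old: dict, new: dict) -> list:
    """Return a sorted list of (kind, product_id) pairs describing the changes."""
    changes = [("added", pid) for pid in new.keys() - old.keys()]
    changes += [("removed", pid) for pid in old.keys() - new.keys()]
    changes += [
        ("changed", pid)
        for pid in old.keys() & new.keys()
        if old[pid] != new[pid]
    ]
    return sorted(changes)


if __name__ == "__main__":
    previous = {"1": {"price": 1}, "2": {"price": 2}}
    current = {"2": {"price": 3}, "3": {"price": 4}}
    print(diff_catalog(previous, current))
```

An empty result would indicate that no update of the static pages 110 or the index page 115 is needed.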
In one embodiment of the present invention, after the processor 105 makes all of the changes corresponding to the changes to the catalog data 100, the processor 105 may resubmit the index page 115 to the webcrawler so that the webcrawler may re-traverse the links of the index page 115, note which pages have been removed, and provide the new pages to the search engine 120 for indexing.
In an alternative embodiment, the processor 105 may omit resubmitting the index page 115 to the webcrawler since conventional webcrawlers periodically re-traverse already indexed pages to ensure that the index of the search engine 120 is up-to-date. Therefore, even if the processor 105 does not resubmit the index page 115 to the webcrawler after updating the index page 115 and the static pages 110, the webcrawler would eventually obtain the index page 115. The index of the search engine 120 would accordingly be updated.
In an embodiment of the present invention, a user at a terminal 125 may transmit towards the search engine 120 a request to conduct a search for web pages according to input search criteria, e.g., search words. In response, the search engine 120 may access its index of pages to determine which, if any, of the indexed pages satisfy the search criteria. If any of the static pages 110 satisfies the search criteria, e.g., pertains to the search words, the search engine 120 may return to the terminal 125 a link to the “matching” static page 110. If the user selects the link to the static page 110, a server 130 may retrieve the static page 110 from the memory 1, and may transmit the static page 110 to the terminal 125. It will be appreciated that the processor 105 may be located at the server 130 and perform processes of the server 130. Alternatively, the server 130 may include its own processor to perform processes of the server 130.
The search engine 120 may return to the terminal 125 a link to the index page 115. The user at the terminal 125 may accordingly select directly in the index page 115 a link to a static page 110. Accordingly, the processor 105 may insert into the index page 115 data other than links to static pages 110. For example, the index page 115 may include a description of the content of the index page 115, so that the user at the terminal 125 may make an informed selection of links in the index page 115 to static pages 110.
In an embodiment of the present invention, the user at the terminal 125 may directly enter a URL of a static page 110 in order to request the static page 110. In response, the server 130 may retrieve the requested static page 110 from the memory 1, and may transmit the static page 110 towards the terminal 125. The user may also directly enter a URL in order to request a dynamic page 135. Although dynamic pages do not exist until they are requested, the dotted lines in
In one embodiment of the present invention, the user at the terminal 125 may also request a dynamic page 135 by selecting, in a returned static page 110, a link to other pages. Since the dynamic pages 135 that are available for retrieval may change over time, the static pages 110 that include links to dynamic pages 135 may become outdated. Accordingly, in one embodiment of the present invention, the static pages 110 may be regenerated, e.g., periodically.
Alternatively, the processor 105 may insert into each static page 110 it generates, a link to a master dynamic page 135, which may be constantly maintained. After the user obtains the static page 110 at the terminal 125, the user may select the link to the master dynamic page 135. In response, the server 130 may dynamically generate and transmit the master dynamic page 135, which may include links to other dynamic pages 135 that correspond to the generated static pages 110. Since the master dynamic page 135 is dynamically generated, it may reflect changes in the availability of individual dynamic pages 135.
Alternatively, instead of links to particular dynamic pages 135, the processor 105 may insert into the static pages 110 a link to the server 130. In response to the selection of the link to the server 130, the server 130 may generate a dynamic page 135 that includes links to other dynamic pages 135 that correspond to the static pages 110.
In an alternative embodiment, the processor 105 may generate the static pages 110 without links to other dynamic pages 135. Instead, in response to a request for a static page 110, the server 130 may retrieve the requested static page 110 from the memory 1, and may update the static page 110 with links to dynamic pages 135 before transmitting the static page 110 towards the terminal 125.
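The updating of a retrieved static page with links to dynamic pages, just before transmission, may be sketched as follows. The placement of the links before the closing body tag, the list markup, and the function name are assumptions made for this example.

```python
import html


def add_dynamic_links(static_html: str, dynamic_links: list) -> str:
    """Insert links to dynamic pages into a stored static page at request time.

    dynamic_links: list of (label, url) pairs that are current at request time.
    """
    items = "".join(
        f'<li><a href="{html.escape(url, quote=True)}">{html.escape(label)}</a></li>'
        for label, url in dynamic_links
    )
    block = f"<ul>{items}</ul>"
    if "</body>" in static_html:
        # Place the links just before the end of the page body.
        return static_html.replace("</body>", block + "</body>", 1)
    return static_html + block


if __name__ == "__main__":
    stored = "<html><body><h1>Tea</h1></body></html>"
    print(add_dynamic_links(stored, [("Mugs", "/shop?page=3&cat=mugs")]))
```

Because the links are added when the page is requested, the stored static page itself need not be regenerated when the set of available dynamic pages changes.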
In an alternative embodiment, when the server 130 receives a request for a static page 110, the server 130 may transmit towards the terminal 125 a dynamic page 135 that corresponds to the requested static page 110, instead of the requested static page 110. The dynamic page 135 transmitted towards the terminal 125 may include links to other dynamic pages 135.
In response to a search request by a user, the search engine 120 may return links to web pages, including static pages 110, that satisfy the search criteria entered by the user, but that are not ultimately of interest to the user. The user may select one of the returned links to a static page 110. After the user views contents of the static page 110, the user may determine that the static page 110 is not of interest and may refrain from interacting with the page, e.g., may refrain from selecting any links in the static page 110. In an embodiment of the present invention, the server 130 may selectively open an interactive session, e.g., a shopping session, only in response to a request for a dynamic page 135, and not in response to a request for a static page 110. Accordingly, when the user selects a link to a static page 110 returned by the search engine 120, the server 130 would not open the interactive session. However, if the user interacts with the returned static page 110, e.g., selects a link in the static page 110 or indicates a desire to order a product, the server 130 may then open an interactive session, e.g., a session in which the user may enter the order. To open the interactive session, the server 130 may create a new session object that may be maintained until the session is terminated.
In 210, the user at the terminal 125 may select the link to the static page 110. Based on the URL of the selected link, a request for the static page 110 may be transmitted by the terminal 125 towards the server 130. In one example embodiment of the present invention, in 215, the server 130 may transmit the static page 110 towards the terminal 125 in response to the page request. However, at this time, the server 130 may refrain from opening an interactive session, e.g., a web shop interactive session.
In 220, the user may enter the dynamic shop by selecting a link in the returned static page 110, or by interacting with the static page 110 in some other way. For example, in response to an interaction with the static page 110, such as a selection of a link, the terminal 125 may transmit towards the server 130 a request for generation and transmission of a dynamic page 135 for a particular product or a general request to open an interactive session. In response, the server 130 may open an interactive session for the terminal 125. After 220, the server 130 may, in 225, transmit dynamic pages 135 in response to all page requests made by the terminal 125, e.g., as long as the interactive session is maintained.
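The deferred opening of the interactive session in 215 through 225 may be sketched as follows. The ShopServer class, its in-memory session store, the use of a random identifier as the session key, and the placeholder dynamic page body are assumptions made for illustration only.

```python
import uuid


class ShopServer:
    """Toy server that opens a session only when a dynamic page is requested."""

    def __init__(self, static_pages: dict):
        self.static_pages = static_pages  # path -> stored static HTML
        self.sessions = {}                # session ID -> session object

    def handle(self, path: str, session_id=None):
        """Return (page body, session ID or None) for one page request."""
        has_session = session_id in self.sessions
        if path in self.static_pages and not has_session:
            # As in 215: return the stored static page without opening a session.
            return self.static_pages[path], None
        if not has_session:
            # As in 220: the first dynamic request creates the session object.
            session_id = uuid.uuid4().hex
            self.sessions[session_id] = {"basket": []}
        # As in 225: while the session is maintained, answer with dynamic pages.
        return f"<html><body>dynamic page for {path}</body></html>", session_id


if __name__ == "__main__":
    server = ShopServer({"/static/42.html": "<html>static</html>"})
    print(server.handle("/static/42.html"))
    print(server.handle("/shop?page=42"))
```

In this sketch a visitor who arrives from a search result and leaves without interacting never causes a session object to be created, which is the saving in processor time and memory that the embodiment aims at.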
Those skilled in the art can appreciate from the foregoing description that the present invention can be implemented in a variety of forms. Therefore, while the embodiments of this invention have been described in connection with particular examples thereof, the true scope of the embodiments of the invention should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims.
This application claims the benefit of provisional application Ser. No. 60/586,275, filed Jul. 9, 2004 and of provisional application Ser. No. 60/586,730, filed Jul. 12, 2004.
Number | Date | Country
---|---|---
60586275 | Jul 2004 | US
60586730 | Jul 2004 | US