1. Field of the Invention
This invention relates generally to graphical user interfaces (GUIs). More specifically, this invention relates to an apparatus and method for graphically displaying results of a search conducted on an information network such as the Internet, local and remote databases of content providers, etc.
2. Description of the Related Art
A significant development in computer networking is the Internet, which is a sophisticated worldwide network of computer systems. A user that wishes to access the Internet typically does so using a software program known as a web browser that is hosted on a personal computer or other data processing device that is capable of executing the web browser program and being connected to the Internet. A web browser uses a standardized interface protocol, such as HyperText Transfer Protocol (HTTP), to make a connection via the Internet to other computers known as web servers, to receive user commands to operate certain browser functions and/or to request information from the Internet, and to receive information from the web servers that is presented to the user, typically on a display device such as a monitor.
An ever-increasing amount of information is available on the Internet and other information databases (collectively referred to as information networks). A query to an information network requires a textual specification based on keywords and logical operators between keywords. In most instances, the query returns only the results, which may not be very useful when the number of results returned is much larger than that which can be viewed and manipulated on a screen.
When performing a search, a search strategy is typically used to find the desired information. Most search strategies are premised on attaining a reasonable number of items that satisfy the search criteria. Typically, a query is composed of keywords (i.e., search terms) connected together via logical and/or proximity operators. Logical operators are used to include or exclude items in a set, whereas proximity operators are used to identify items having keywords that are a predetermined distance apart, such as within 10 words, in the same sentence, or adjacent. Once a query is made and executed, a list of items satisfying the criteria of the query is presented to the user. The user can then either view one or more items in the list or, if the list is large, modify the search to reduce the number of items in the list.
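The distinction between logical and proximity operators can be illustrated with a minimal sketch. The function names (tokenize, matches_all, excludes, within) and the choice to measure distance as the difference between word positions are assumptions made for illustration only; they are not taken from any particular search engine.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all(text, keywords):
    """Logical AND: every keyword must occur somewhere in the text."""
    words = set(tokenize(text))
    return all(keyword.lower() in words for keyword in keywords)


def excludes(text, keywords):
    """Logical NOT: none of the keywords may occur in the text."""
    words = set(tokenize(text))
    return not any(keyword.lower() in words for keyword in keywords)


def within(text, first, second, max_distance):
    """Proximity: some occurrence of `first` lies at most `max_distance`
    word positions away from some occurrence of `second` (assumed metric)."""
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )
```

Under these assumptions, a query such as "search AND list, within 10 words" would combine matches_all and within on the same item; adjacency corresponds to a maximum distance of 1.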
Data navigation is the process of viewing different dimensions, slices, and levels of detail of a multidimensional database. In a typical list of search results from an information network, documents or other items are listed in descending order based on a relevancy value. The relevancy value for each document is based on the number of times the keywords are found in the document. A user must still sort through the list sequentially to view other characteristics of the documents, such as size and date, which may also help determine a document's relevancy. Thus it is desirable to provide a data navigation tool which allows the user to view, sort, and navigate search results according to several different data and relevant characteristics.
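The keyword-count relevancy ordering described above can be sketched as follows. This is an illustrative simplification, assuming that relevancy is the raw count of keyword occurrences; the names relevancy and rank are hypothetical, and actual search engines may weight occurrences differently.

```python
import re


def relevancy(text, keywords):
    """Count how many word tokens in the text are one of the keywords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    wanted = {keyword.lower() for keyword in keywords}
    return sum(1 for word in words if word in wanted)


def rank(documents, keywords):
    """Return document titles in descending order of relevancy.

    `documents` maps a title to the document text (assumed layout).
    """
    return sorted(
        documents,
        key=lambda title: relevancy(documents[title], keywords),
        reverse=True,
    )
```

A list ordered this way exposes only one characteristic at a time, which is the limitation the paragraph above identifies.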
One technique for sorting lists is known as data clustering, which is the process of dividing a data set into mutually exclusive groups such that the members of each group are as “close” as possible to one another, and different groups are as “far” as possible from one another, where distance is measured with respect to all available variables. There are several models for data clustering, e.g., K-means clustering, self-organizing feature maps, the neural gas algorithm, and complexity optimized vector quantization.
In the K-means procedure, for example, suppose a set of feature vectors x_1, x_2, . . . , x_n are from the same class or subset, and that they fall into k compact clusters, k<n. Let m_i be the mean of the vectors in cluster i. If the clusters are well separated, a minimum-distance classifier can be used to separate them. That is, x is in cluster i if ∥x − m_i∥ is the minimum of all the k distances. Thus, the k-means procedure partitions the n examples into k clusters so as to minimize the sum of the squared distances to the cluster centers. The results depend on the value of k, which can be any value from 2 to n. When k=n, the procedure is known as the nearest neighbor classifier.
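A minimal sketch of the procedure follows. It uses the common iterative form (assign each vector to its nearest mean, then recompute the means), and it assumes, purely for illustration, that the first k vectors serve as the initial means; the function names are hypothetical and the initialization in a practical system would likely differ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return tuple(sum(column) / count for column in zip(*vectors))


def k_means(vectors, k, max_iterations=100):
    """Partition `vectors` into k clusters; return (assignment, means)."""
    # Assumption: the first k vectors are used as the initial means.
    means = [tuple(v) for v in vectors[:k]]
    assignment = None
    for _ in range(max_iterations):
        # Minimum-distance rule: x joins cluster i when ||x - m_i|| is smallest.
        new_assignment = [
            min(range(k), key=lambda i: squared_distance(v, means[i]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the means will not change
        assignment = new_assignment
        # Recompute each mean from the vectors currently assigned to it.
        for i in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            if members:
                means[i] = mean(members)
    return assignment, means
```

For example, calling k_means on the four points (0, 0), (10, 10), (0, 1), (10, 11) with k=2 places the first and third points in one cluster and the second and fourth in the other.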
The present invention provides a method and apparatus for representing and navigating search results from a database on a computer system. A graphical user interface is generated to receive user input including a data source to search for information, and a query indicating information which is desired from the data source. The user input is transmitted to the data source, the search is performed, and information responsive to the query resulting from the search is received from the data source. The search results include characteristics of the responsive information. The responsive information is clustered into a plurality of groups based on selected characteristic information, and means are provided to allow the user to select at least one group of the responsive information to be displayed.
The responsive information includes a list of documents containing information related to the query. The graphical user interface includes a first display portion showing the plurality of groups of characteristic information available for the user to select, and a second display portion showing the list of documents in the responsive information.
In one embodiment, when the user selects one or more groups, the documents displayed in the second display portion belong to the group(s) selected by the user. When a group is selected, it is separated into a plurality of subgroups based on the range of the characteristic information for the selected group. The first display portion is updated to show the plurality of subgroups.
In another embodiment, each group is separated into a plurality of subgroups based on the range of the characteristic information for each group. The first display portion shows the plurality of subgroups, which may be color coded to differentiate the subgroups. Similarly, the list of documents in the second display portion may be correspondingly color coded to the color code in the first display portion.
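One way to separate a group into subgroups based on the range of a characteristic is equal-width binning, sketched below. Equal-width binning is an assumption chosen for illustration; the text does not specify how the range is divided, and the function name split_by_range is hypothetical.

```python
def split_by_range(values, subgroup_count):
    """Split document indices into subgroups of equal value range.

    `values` holds one characteristic (for example, size) per document.
    Returns a list of `subgroup_count` lists of document indices.
    """
    low, high = min(values), max(values)
    # Fall back to a width of 1 when every value is identical.
    width = (high - low) / subgroup_count or 1
    subgroups = [[] for _ in range(subgroup_count)]
    for index, value in enumerate(values):
        # The maximum value would land one past the end, so clamp it.
        slot = min(int((value - low) / width), subgroup_count - 1)
        subgroups[slot].append(index)
    return subgroups
```

Each resulting subgroup could then be assigned its own color so that the blocks in the first display portion and the matching rows in the second display portion share a color code.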
In another embodiment, a server may be used to transmit data between the client computer system and the data source. In this configuration the server includes program instructions for separating the documents into the plurality of groups based on selected characteristic information.
In another embodiment of the present invention, additional information may be displayed based on the group of responsive information selected by the user.
In another embodiment of the present invention, the first display portion includes a stratum showing the subgroups of the documents. When the user selects one or more subgroups, another stratum showing the subgroup of the responsive information is displayed. The responsive information in the second display portion is based on the subgroup selected by the user.
Another feature of the present invention allows the user to select a document to be displayed for the user to examine its contents.
Another feature of the present invention allows the user to re-arrange the order in which the list of documents in the second display portion is displayed.
The foregoing has outlined rather broadly the objects, features, and technical advantages of the present invention so that the detailed description of the invention that follows may be better understood.
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference symbols in different drawings indicates similar or identical items.
The method and apparatus of the present invention are applicable to devices that access a computerized information network. A number of different information networks are available that allow access to information contained on their computers, with the Internet being one that is generally known to the public. While the Internet is used herein as an example of how the present invention is utilized, it is important to recognize that the present invention is also applicable to other information networks and information systems including Intranets, database management systems, and document retrieval systems. For those who are not familiar with the Internet, the world-wide web, web servers, and web browsers, a brief overview of these concepts is presented here.
An example of a typical Internet connection found in the prior art is shown in
The web servers 118, 120, 122, 124 execute a web server application program which monitors requests, services requests for the information on that particular web server, and transmits the information to the user's workstation 112. A web page is primarily visual data that is intended to be displayed on the display monitor of the user's workstation 112. When web server 118 receives a web page request, it will transmit a document, generally written in a markup language such as hypertext markup language (HTML), across communication link 116 to the requesting web browser 114. When web server 118 receives a search request, the request is sent to the server containing the search engine specified by the user. The search engine then compiles one or more pages containing a list of links to web pages on other web servers 120, 122, 124 that may contain information relevant to the user's request. The search engine transmits the page(s) in markup language back to the requesting web server. Web browser 114 interprets the markup language and outputs the web page to the monitor of user workstation 112. This web page displayed on the user's display may contain text, graphics, and links (which are addresses of other web pages). These other web pages (i.e., those represented by links) may be on the same or on different web servers 116. The user can go to these other web pages by clicking on the links using a mouse or other pointing device. This entire system of web pages with links to other web pages on other servers across the world comprises the world wide web.
Workstation 112 and/or web servers 116 are computer systems, such as computer system 130 as shown in
The peripheral devices usually communicate with processor 132 over one or more buses 134, 156, 158, with the buses communicating with each other through the use of one or more bridges 160, 162. Computer system 130 may be one of many workstations or servers connected to a network such as a local area network (LAN), a wide area network (WAN), or a global information network such as the Internet through network interface 140.
CPU 132 can be constructed from one or more microprocessors and/or integrated circuits. Main memory 136 stores programs and data that CPU 132 may access. When computer system 130 starts up, an operating system program is loaded into main memory 136. The operating system manages the resources of computer system 130, such as CPU 132, audio controller 142, storage device controller 138, network interface 140, I/O controllers 146, and host bus 134. The operating system reads one or more configuration files to determine the hardware and software resources connected to computer system 130.
During operation, main memory 136 includes the operating system, configuration file, and one or more application programs with related program data. Application programs can run with program data as input, and output their results as program data in main memory 136 or to one or more mass storage devices through a memory controller (not shown) and storage device controller 138. CPU 132 executes many application programs, including one or more programs to establish a connection to a computer network through network interface 140. The application programs may be embodied in one executable module or may be a collection of routines that are executed as required.
Storage device controller 138 allows computer system 130 to retrieve and store data from mass storage devices such as magnetic disks (hard disks, diskettes), and optical disks (DVD and CD-ROM). The mass storage devices are commonly known as Direct Access Storage Devices (DASD), and act as a permanent store of information. The information from the DASD can be in many forms including application programs and program data. Data retrieved through storage device controller 138 is usually placed in main memory 136 where CPU 132 can process it.
One skilled in the art will recognize that the foregoing components and devices are used as examples for sake of conceptual clarity and that various configuration modifications are common. For example, audio controller 142 is connected to PCI bus 156 in
The present invention is designed to provide the user with more information regarding the results of a search and to allow the user to navigate through the information to facilitate finding the most relevant documents. In one embodiment shown in
The remaining portion of flowchart 400 pertains to another feature of the present invention, namely a graphical user interface (GUI) for selecting options and viewing the documents in different groups, or classes, according to selected display criteria.
A user then selects a source of information in data source window 504 by either typing in the name of the source directly or selecting an entry in a pull-down menu that is accessed by selecting arrow 506. To send the contents of query window 502 and data source window 504 to server 118, the user selects search button 508. Once server program instructions 304 compile and format the results of the search, they are sent to client program instructions 302. The results of the search include categories of information such as, for example, the size of each document, the rank of the documents as determined by the search engine, the date that each document was posted on the information network, the language each document is written in, the URL of each document, and the cluster in which each document is grouped as determined by the clustering algorithm utilized with the present invention, such as a K-means clustering algorithm. The type of information available is based on the categories of information available from data sources 312 through 322. For example, a data source for a sales catalog may include a number of different categories of information including, but not limited to, products, price, discount, product availability, sizes, colors, and other physical properties. Another example is a stock market data source that may include information including, but not limited to, number of shares outstanding, price per share, earnings per share, trading volume, and insider trading. The present invention is designed to be used with virtually any categories of information that are available from a data source. The information that is returned in response to a user's query is typically a list of data records for items such as, for example, documents, stocks, or products. For convenience of notation, the word “documents” is used herein to refer to the data records that are returned in response to the user's query.
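The kind of record returned for each document can be sketched as a simple data structure. The field names, the DocumentRecord class, and the group_by helper below are hypothetical and chosen only to mirror the categories listed above (size, rank, date, language, URL, and cluster); a different data source would expose different categories.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentRecord:
    """One search result with the categories named in the text (assumed layout)."""
    url: str
    size: int        # document size, assumed to be in bytes
    rank: float      # rank assigned by the search engine
    posted: date     # date the document was posted
    language: str
    cluster: int     # cluster label from the clustering algorithm


def group_by(records, category):
    """Group records by the value of one category, e.g. 'language'."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, category)].append(record)
    return dict(groups)
```

Grouping by any category in this way yields the sets of documents that a column of the graphical user interface could represent.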
Referring to process 416 in
Subsequently, in process 418, the client program instructions 302 generate and display a graphical user interface to allow the user to view and navigate the various categories of information available. For example, GUI 500 in
In another feature of the present invention, when the user moves a pointer, such as a mouse cursor or a light pen, over a block, a readout, for example the range of dates in window 536 as shown in
The sub-groups in each of columns 510 through 520 are indicated by a series of adjacent blocks, such as blocks 540 through 546 for rank column 510. When columns 510 through 520 are initially generated, only first stratum 509 is displayed. One embodiment of the present invention includes another feature in client program instructions 302 that generates a successive stratum when the user selects a block in a preceding stratum. For example,
A user may also select more than one block in a stratum, as shown, for example, by blocks 554 and 556 in
Another feature of table 534 is color-coded portions 558 through 566, which correspond to the color-coding of the blocks in columns 510 through 520. These portions allow the user to readily see which sub-group of the selected category each document shown in table 534 belongs to.
The width of the blocks in each stratum represents the relative number of records in the cluster represented by a block. Thus, the wider a block is, the more records it includes. Additionally, the height of a block indicates the relative number of records contained in that block's stratum. These height and width indicators provide another visual cue of the distribution of the documents according to the various categories for which information is available, and allow the user to visually determine which cluster is likely to contain relevant information. For example, a user may find that only documents from a selected time frame would be relevant. In this situation, the user could select the block containing documents that are near the desired date, with the result that only those documents would be shown in table 534. This feature allows the user to navigate through a reduced number of documents to find those that are most relevant, thereby saving time. Notably, the user may select one or more blocks from one or more different columns to generate a list of documents in table 534 that meet criteria in two or more categories, for example, size and date.
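The proportional block widths and the multi-column selection described above can be sketched as follows. The function names, the representation of a selected block as an inclusive (low, high) range, and the use of dictionaries for documents are all assumptions made for illustration.

```python
def block_widths(record_counts, total_width):
    """Give each block a width proportional to its share of the records."""
    total = sum(record_counts)
    return [total_width * count / total for count in record_counts]


def filter_documents(documents, selections):
    """Keep documents that satisfy the selected blocks in every category.

    `selections` maps a category name to the list of inclusive (low, high)
    ranges of the blocks selected in that column. Within one column the
    selected blocks are alternatives; across columns all must be satisfied.
    """
    return [
        document
        for document in documents
        if all(
            any(low <= document[category] <= high for low, high in ranges)
            for category, ranges in selections.items()
        )
    ]
```

For instance, selecting one size block and one date block under this sketch lists only the documents that fall inside both ranges.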
To further facilitate navigation, the URL for a document listed in table 534 may be accessed in another browser frame when the listing is selected with a selection device, such as clicking a mouse cursor or light pen over the document listing.
Another feature that may be implemented in an embodiment of the present invention is group column 520. The clustering algorithm automatically groups similar records of the documents found in the search together. Group column 520 allows the user to select a cluster and examine the blocks in the new stratum. The widths of the blocks in the new stratum will allow the user to evaluate the breakdown of the groups and why records are assigned to a given group.
Referring back to flow diagram 400 in
Processes 428 through 430 show that the URL for a document is accessed and the corresponding web page is displayed in another browser frame when a document is selected from the list of documents in table 534.
When a column button, such as one of column buttons 522 through 532, is selected, processes 432 and 434 show that the list of documents in table 534 is re-sorted in ascending or descending order with respect to the criterion corresponding to the column selected.
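The re-sorting behavior can be sketched as a small table model. The DocumentTable class is hypothetical, and the rule that selecting the same column twice reverses the order is an assumption; the text states only that the list is sorted in ascending or descending order.

```python
class DocumentTable:
    """Holds the listed documents and re-sorts them when a column is selected."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.sort_column = None
        self.descending = False

    def select_column(self, column):
        """Sort by `column`; selecting the same column again flips the order."""
        if column == self.sort_column:
            self.descending = not self.descending
        else:
            self.sort_column = column
            self.descending = False
        self.rows.sort(key=lambda row: row[column], reverse=self.descending)
```

A first selection of a size column would list the smallest documents first, and a second selection of the same column would list the largest first.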
While the invention has been described with respect to the embodiments and variations set forth above, these embodiments and variations are illustrative and the invention is not to be considered limited in scope to these embodiments and variations. For example, the present invention may be used to deliver personalized advertising to the client. Preferences for advertising content may be specified by the user, or the client program instructions 302 or server program instructions 304 could choose advertisements based on the topic(s) being searched by the user. Accordingly, various other embodiments and modifications and improvements not described herein may be within the spirit and scope of the present invention, as defined by the following claims.
The present reissue application is a continuation of reissue application U.S. application Ser. No. 11/256,615, filed Oct. 21, 2005, now U.S. Pat. No. Re. 42,262, which is a reissue application of U.S. application Ser. No. 09/385,149, filed Aug. 30, 1999, now U.S. Pat. No. 6,636,853. More than one reissue application has been filed.
Number | Name | Date | Kind |
---|---|---|---|
5530852 | Meske et al. | Jun 1996 | A |
5649186 | Ferguson | Jul 1997 | A |
5722418 | Bro | Mar 1998 | A |
5761662 | Dasan | Jun 1998 | A |
5784608 | Meske et al. | Jul 1998 | A |
5857179 | Vaithyanathan | Jan 1999 | A |
5991756 | Wu | Nov 1999 | A |
6023701 | Malik et al. | Feb 2000 | A |
6070157 | Jacobson | May 2000 | A |
6141007 | Lebling et al. | Oct 2000 | A |
6185553 | Byrd et al. | Feb 2001 | B1 |
6189019 | Blumer et al. | Feb 2001 | B1 |
6199099 | Gershman et al. | Mar 2001 | B1 |
6202058 | Rose et al. | Mar 2001 | B1 |
6243713 | Nelson et al. | Jun 2001 | B1 |
6275829 | Angiulo et al. | Aug 2001 | B1 |
6289350 | Shapiro et al. | Sep 2001 | B1 |
6327574 | Kramer et al. | Dec 2001 | B1 |
6370535 | Shapiro et al. | Apr 2002 | B1 |
6385602 | Tso | May 2002 | B1 |
6393469 | Dozier et al. | May 2002 | B1 |
6999959 | Lawrence | Feb 2006 | B1 |
Number | Date | Country |
---|---|---|
09231238 | Sep 1997 | JP |
10143517 | May 1998 | JP |
11213008 | Aug 1999 | JP |
9738378 | Mar 1997 | WO |
9710537 | Oct 1997 | WO |
Entry |
---|
U.S. Appl. No. 11/256,615, filed Oct. 21, 2005; Entitled: Method and apparatus for representing and navigating search results. |
Zamir et al., “Web Document Clustering: A Feasibility Demonstration”, ACM 1998. |
Hirtle et al., “Clusters on the World Wide Web: creating neighborhoods of make-believe”, ACM 1998. |
Mukherjea et al., “Using Clustering and Visualization for Refining the Results of a WWW Image Search Engine”, ACM 1998. |
Sebrechts et al., “Visualization of Search Results: a comparative evaluation of text, 2D, and 3D interfaces”, ACM Aug. 1999. |
Roussinov et al., “Interactive Internet Search through Automatic Clustering: An Empirical Study”, ACM Aug. 1999. |
Nowell et al., “Visualizing Search Results: Some Alternatives to Query-Document Similarity”, ACM 1996. |
Terveen et al., “Constructing, Organizing, and Visualizing Collections of Topically related Web resources”, ACM Mar. 1999. |
Number | Date | Country | |
---|---|---|---|
Parent | 11256615 | Oct 2005 | US |
Child | 09385149 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 09385149 | Aug 1999 | US |
Child | 11513838 | US |