It is subjectively clear that some users are “better” than others at searching the Web. Some people are skilled searchers who have searched the Web hundreds of times per day for the last ten years, while other searchers are novices who have only searched the Web once or twice. Being “good at Web searching”, for purposes of this document, is independent of a searcher's domain-specific knowledge, and does not mean that the searcher has a great deal of knowledge about the material for which he is searching. Instead, a user's general skill level and efficiency at searching the Web dictate whether the searcher is a good searcher. For example, certain searchers are better at search tasks such as formulating search queries, quickly evaluating search results, and integrating information from multiple search systems.
Formulating search queries is an important problem that has been studied extensively in recent years. It has been shown that the inclusion of advanced operators offered by Web search engines in queries can lead to improved search performance. These advanced operators include plus (+), minus (−), double quotes (“ ”), “site:” (an operator that restricts the search to a particular host or Web domain), and “weight:” (an operator that expresses the importance or weight of a query term relative to other query terms). One problem, however, is that novice users generally do not use these operators. This is because they may not be aware of the existence of advanced operators or may not understand the positive impact that the utilization of advanced syntax can have on performing productive searches.
One of the most important aspects of the search query formulation process is the ability to effectively transform or refine a search query. Typically, a searcher desires to refine a search query if relevant or desired results are not retrieved with the initial query. However, novice searchers frequently have difficulty refining a query following an unsuccessful search. Another problem is that query refinement typically requires extensive typing and text manipulation, which is difficult when using mobile devices or pen-based computers. This is because typing on these devices generally is an extremely slow operation, and careful replacement of text within a text field is even more difficult. As mobile searches become more prevalent, this limits the utility of the traditional text-based mechanisms for query refinement.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments of the search query transformation system and method include refining an initial search query using a graphical user interface. Search transformation and refinement operations as well as advanced search operators are represented graphically to allow a searcher to refine the initial search query quickly and easily. Because the search transformation is graphical, direct manipulation can be used. Direct manipulation allows graphical objects to be directly manipulated using rapid, reversible, and incremental actions that loosely correspond to the physical world. In other words, direct manipulation means that the searcher is making or driving the changes in the search queries through their interaction with the system.
Supporting search query transformation through direct manipulation is valuable for a number of reasons. In particular, using this mechanism, query quality can be improved for novice searchers, and query refinement time can be reduced for mobile users and novice searchers. Moreover, direct manipulation shields a searcher from advanced query syntax, which allows him to focus more on what he wants rather than how to express himself in a way that can be directly interpreted by the search engine. In other words, direct manipulation as used in embodiments of the search query transformation system and method makes the search query transformation and revision intuitive to even novice searchers who are unfamiliar with the subtleties of searching the Web.
Embodiments of the search query transformation system and method include a search query re-weighting user interface (UI) component, a search query term replacement UI component, and a search query suggestion component. The search query re-weighting UI component uses graphical UI controls to adjust and re-weight the weights of search terms and adjust grouping of the search terms. The re-weighting UI component enables the formulation of syntactically complex query statements that incorporate advanced search operators. This component both shields the searcher from the potentially confusing underlying search engine mechanics and leverages familiar interface elements to empower searchers. More generally, graphical controls are used along with direct manipulation to allow searchers to create complex, syntactically-rich queries that use advanced operators without the need to manually manipulate the text of the search query. This visualization of a search query formulation creates a representation that may be better understood by the 80% of searchers who are novice searchers and have never used advanced search operators.
For example, some embodiments use sliders to graphically represent the weight of each search term in the search query. Moreover, graphical controls allow a searcher to insert and remove search terms with the touch of a button. Some embodiments include graphical controls for phrase creation and division, such that the searcher can graphically put search terms in quotes to create a phrase or divide search terms out of a phrase. Other embodiments include the visualization of search term weights in search results. For example, visual weight bars can be placed next to each entry in a search result list to allow the searcher to compare how well each search term in a query matched up with the particular search result. This allows the searcher to know how to best refine the initial search query to obtain more relevant search results. Embodiments of the re-weighting UI component also include various techniques for the rapid re-querying of transformed or refined search queries. In general, rapid re-querying provides virtually instantaneous reordering of search results based on a revised search query by pre-fetching and pre-caching a number of results.
The search query term replacement UI component allows a searcher to use graphical controls to replace a search term in a query or add a synonym to the query. In some embodiments a synonym tree is used. The synonym tree is a UI component that allows rapid replacement of individual search terms and exploration of alternative queries. This component eases the process of query refinement for novice searchers, and eases the process of selecting new query terms for searchers using mobile or pen-based devices (since no typing is required).
The search query suggestion component provides query refinement recommendations to a searcher that are specifically tailored to the direct-manipulation interface presented in this document. In some embodiments, suggestions are based on expert searcher recommendations, which use interaction log data from search engine query logs and from user interaction with the embodiments of the system and method. These recommendations provide a searcher with knowledge of how an expert searcher would refine the search query at hand and the settings that the expert searcher would make to the components and graphical controls. In some embodiments another source of recommendations is a database of popular searches, which uses the same type of data except from all searchers (not just expert searchers) that may have entered the same or similar queries. A searcher is provided with recommendations to refine a search query using embodiments of the proposed system and method, based on what a majority of searchers with a possibly similar information need have done in the past.
It should be noted that alternative embodiments are possible, and that steps and elements discussed herein may be changed, added, or eliminated, depending on the particular embodiment. These alternative embodiments include alternative steps and alternative elements that may be used, and structural changes that may be made, without departing from the scope of the invention.
Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
In the following description of embodiments of the search query transformation system and method reference is made to the accompanying drawings, which form a part thereof, and in which is shown by way of illustration a specific example whereby embodiments of the search query transformation system and method may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the claimed subject matter.
The computing device 120 includes the search query transformation system 100, a search engine browser 130 for processing search requests, an initial search results list 135 generated by the search engine 130 and residing on the browser 130, and a revised search results list 140. The initial search results list 135 contains a list of search results that are ranked according to relevance to a user-supplied search query. The search results are Web pages that a searcher using the search engine browser 130 can hyperlink to by clicking on a certain search result.
The computing device 120 is connected to a network 150. Also connected to the network 150 are a first computer 160 and a second computer 165. The first computer 160 includes a first display device 170 and a first input device (such as a first keyboard 175) that allows a first user 180 to submit search queries and obtain search results based on the queries. Similarly, the second computer 165 includes a second display device 185 and a second input device (such as a second keyboard 190) that allows a second user 195 to interface with the search engine 130. Assume that the first user 180 is a skilled searcher (on a continuum of search expertise levels) and the second user 195 is a novice searcher. Based on this information, the search engine browser 130 augmented by the search query transformation system 100 could be displayed on the second display device 185 since novice searchers would greatly benefit from this arrangement. On the other hand, since the first user 180 is a skilled searcher, the browser displayed on the first display device 170 may not include the search query transformation system 100. However, the first user's search behavior may be logged by the system 100 to aid in suggesting search queries to the second user 195 (the novice searcher).
Instead of using advanced syntax or careful term-by-term query modification, embodiments of the system 100 include interface mechanisms that allow rapid query iteration and refinement using a pointing device (such as a mouse). Embodiments of the system 100 use the principle of “direct manipulation.” Direct manipulation means that objects of interest can be directly manipulated by a user through actions that correspond at least loosely to the physical world. Direct manipulation is characterized by rapid, reversible, incremental actions and immediate feedback.
The system 100 includes a search query re-weighting user interface (UI) component 210, a search query term replacement UI component 215, and a search query suggestion component 220. Generally, the search query re-weighting user interface (UI) component 210 allows a searcher to adjust weights of search terms already in initial search query 200 through a graphical user interface. The search query term replacement UI component 215 allows the searcher to replace search query terms with other terms, possibly by showing the searcher synonyms for a search term. The searcher interacts with this component 215 through a graphical user interface. Finally, the search query suggestion component 220 allows the searcher to obtain recommendations for search query revisions based on how others have revised their search queries.
The search query re-weighting UI component 210 includes a variety of components and controls that it can use in a graphical user interface environment. In particular, the searcher can perform graphical search term re-weighting 230 that reassigns weights of search query terms, and can perform graphical search term insertion and removal 235. The UI component 210 also can allow a searcher to perform phrase creation and division 240 to create phrases from the search query terms and also divide them. Search term weight visualization also may be used to present a visual indication to the searcher of how well the search query terms match search results. The UI component 210 also can achieve a rapid re-querying of transformed search queries such that revised search results are provided quickly to the searcher.
The search query term replacement UI component 215 can include a synonym tree 255. The synonym tree 255 provides the searcher a way in which to replace search query terms in a graphical manner. The search query suggestion component 220 can propose query suggestions to the searcher that are tailored to the proposed direct manipulation interface with recommendations from at least two sources. In some embodiments a source is an expert searcher recommendation 260, where recommendations are provided based on expert searchers' searching behavior. In other embodiments a source is a popular search recommendation 265, where the most popular search revisions from the system 100 are supplied to a searcher seeking to revise his search query.
It should be noted that numerous embodiments of the search query transformation system 100 are possible. For example, embodiments of the system 100 may include any combination of the search query re-weighting UI component 210, the search query term replacement UI component 215, and the search query suggestion component 220. Embodiments may include one, two, or all three of these components in any variety or combination. Moreover, embodiments of the system 100 can include any combination of the components of the search query re-weighting UI component 210 and the search query suggestion component 220.
As noted above, the search query transformation system and method can be implemented in a number of different embodiments. The components and controls along with their functionality for the embodiments of the system and method will now be discussed.
IIa. Search Term Re-Weighting User Interface Component
In some embodiments the search query transformation system 100 includes a search query re-weighting user interface (UI) component. The re-weighting UI component enhances a user's Web search query refinement through the direct manipulation of graphical representation of search query objects such as terms and advanced query operators. As stated above, these advanced query operators may include plus (+), minus (−), quotes (“ ”), “site:”, and “weight:”. In some embodiments the search query re-weighting UI component includes a slider. As used in this document, the term “slider” is meant to represent the broadest possible interpretation of the term. By way of example and not limitation, in certain embodiments a slider includes a knob or button that moves in a linear fashion (such as back and forth on a straight track), while in other embodiments a slider knob or button moves in a non-linear manner (such as on a circular track). Some of these embodiments are described below.
In
As shown in
In other embodiments the first slider 330 and the second slider 340 may have shapes that are different from those shown in
The effects of manipulating each control now will be discussed. In general, the core functionality of the search query re-weighting UI component includes search term re-weighting, search term insertion and removal, and phrase creation and division. Search term re-weighting is accomplished using the first slider 330, the second slider 340, and the search transformation box 310. Moving slider buttons between terms changes the relative importance of each term and immediately submits the modified search query to the search engine, or causes the re-scoring of search results in browser 130 if client-side rearrangement of search results is supported (as discussed below). The size of the panel that encloses a search term is indicative of a relative weight assigned to each term. This relative weight translates to a numeric score that represents a search term value in relation to the other search terms. These weights are used to depict the importance of each search term and in the ranking of search results. For example, in
The panels 370, 380 of
Term removal and insertion also can be accomplished using the controls shown in
In this particular implementation, it is assumed that the searcher wants to preserve at least one of the search terms in the original search query. In other words, the term removal and insertion feature does not allow the searcher to remove all search terms. In addition, in this embodiment it is assumed that each term that occupies space in the search transformation box 310 is required in the revised search query. The revised search query that would emerge from the first embodiment 300 of the search query re-weighting user interface component given the settings shown in
Phrase creation and division is another feature of the first embodiment 300 of the search query re-weighting user interface component shown in
A third search result 535 has a corresponding third weight bar 540 that is colored about one-third red, one-third green, and one-third blue. This means that the search terms “microsoft”, “windows”, and “vista” match this third search result 535 equally well. In contrast, a fourth search result 545 has a corresponding fourth weight bar 550 that is almost entirely red. This indicates that the search term “microsoft” matched extremely well but that the search term “windows” did not match the fourth search result 545 very well. Similarly, a fifth search result 555 includes a corresponding fifth weight bar 560 that is almost entirely green. This means that the search term “windows” matched quite well but that the search term “microsoft” did not match the fifth search result 555 very well. A sixth search result 565 includes a corresponding sixth weight bar 570 that contains a small amount of red, a small amount of green, and is mostly blue. This indicates that the search terms “microsoft” and “windows” did not match the sixth search result 565 very well, but that the search term “vista” matched much better as compared to “microsoft” and “windows”.
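By way of illustration only, the proportions of a weight bar can be sketched as normalized per-term match scores. The function name `weight_bar_segments` and the numeric scores are hypothetical, and how a per-term match score is computed is an assumption that the description above does not specify.

```python
from typing import Dict


def weight_bar_segments(term_scores: Dict[str, float]) -> Dict[str, float]:
    """Return the fraction of the weight bar to color for each search term.

    term_scores maps each query term to a non-negative match score for one
    search result (the scoring method itself is an assumption).
    """
    if any(score < 0 for score in term_scores.values()):
        raise ValueError("match scores must be non-negative")
    total = sum(term_scores.values())
    if total == 0:
        # No term matched this result: leave every segment empty.
        return {term: 0.0 for term in term_scores}
    return {term: score / total for term, score in term_scores.items()}


# Illustrative scores only: a result matched equally by all three terms,
# and a result dominated by "microsoft" (an almost entirely red bar).
equal = weight_bar_segments({"microsoft": 2.0, "windows": 2.0, "vista": 2.0})
mostly_red = weight_bar_segments({"microsoft": 9.0, "windows": 0.5, "vista": 0.5})
```

Each returned fraction would then be drawn as one colored segment of the bar (red, green, and blue in the example above).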
The search term weight visualization component 500 shown in
As a searcher uses the search query re-weighting UI component and moves sliders and forms phrases between search terms, the list of search results instantly updates to reflect the current slider position (and the internal query representation created by the slider settings). This real-time visual feedback allows the searcher to make more informed (and immediate) choices about whether her search query has been effective in retrieving relevant information. This is much better than having to resubmit her search query several times to a search engine and wait for a response. The direct and immediate relationship that exists between the position of the sliders and the grouping of search terms and the results these components are manipulating allows users to rapidly explore the space of retrieved search results.
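The relationship between slider positions, panel sizes, and term weights can be sketched as follows. This is a minimal sketch under stated assumptions: slider positions are expressed as fractions of the width of the search transformation box, a panel collapsed to zero width removes its term, and the output uses the “+” and “weight:” operators in the form of the expert query example given elsewhere in this document. The function names are hypothetical.

```python
from typing import List, Sequence


def panel_weights(slider_positions: Sequence[float]) -> List[float]:
    """Convert slider positions (fractions of the box width, in increasing
    order) into relative weights; each weight is the width of one panel."""
    edges = [0.0, *slider_positions, 1.0]
    if any(b < a for a, b in zip(edges, edges[1:])):
        raise ValueError("slider positions must be sorted and lie within [0, 1]")
    return [b - a for a, b in zip(edges, edges[1:])]


def weighted_query(terms: Sequence[str], slider_positions: Sequence[float]) -> str:
    """Build a query string with the "weight:" operator, dropping any term
    whose panel occupies no space (assumed to mean the term is removed)."""
    weights = panel_weights(slider_positions)
    if len(weights) != len(terms):
        raise ValueError("need exactly one slider between each pair of terms")
    parts = [f"+({term}, weight:{weight:.2f})"
             for term, weight in zip(terms, weights) if weight > 0]
    return " ".join(parts)


# Two sliders at 50% and 80% of the box width split three terms 0.5/0.3/0.2.
example = weighted_query(["microsoft", "windows", "vista"], [0.5, 0.8])
```

Because the query string is rebuilt from the slider positions alone, it can be regenerated on every slider movement without the searcher editing any text.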
Revising the search query can potentially generate dozens of new queries per second. Submitting queries to search engines at this rate is not yet practical. In some embodiments of the search query re-weighting UI component, a separate list of “popular destinations” that other searchers have visited is manipulated. This mitigates the need to contact the search engine at each iteration, as the list of destinations is based on cached interaction log data, not search engine results. In other embodiments, rapid re-querying is achieved by batching several queries for nearby positions of each slider and submitting them proactively to the search engine. This allows search results for many possible UI interactions to be pre-cached and ready for immediate viewing. In yet another embodiment of rapid re-querying, a large but practical number of search results (on the order of about 1000 search results) are requested from the search engine. These search results then are re-ranked locally according to the search term weighting as the searcher manipulates the search query re-weighting UI component. New queries with appropriately modified weights can be resubmitted in the background at a maximum possible rate.
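The local re-ranking embodiment can be illustrated with a short sketch. It assumes that each cached search result carries a per-term match score and that results are ordered by the weighted sum of those scores; both the scoring model and the names `CachedResult` and `rerank` are hypothetical.

```python
from typing import Dict, List, Tuple

# One cached result: (URL, per-term match scores). The per-term scores are
# hypothetical; the description does not say how they are obtained.
CachedResult = Tuple[str, Dict[str, float]]


def rerank(cached: List[CachedResult], weights: Dict[str, float]) -> List[str]:
    """Order cached results by the weighted sum of their per-term scores."""
    def score(result: CachedResult) -> float:
        _, term_scores = result
        return sum(weights.get(term, 0.0) * s for term, s in term_scores.items())

    # sorted() is stable, so tied results keep the search engine's order.
    return [url for url, _ in sorted(cached, key=score, reverse=True)]


cached = [
    ("a.example", {"microsoft": 0.9, "windows": 0.1}),
    ("b.example", {"microsoft": 0.2, "windows": 0.8}),
]
# Moving the slider toward "windows" flips the order without a new request.
favor_microsoft = rerank(cached, {"microsoft": 0.8, "windows": 0.2})
favor_windows = rerank(cached, {"microsoft": 0.2, "windows": 0.8})
```

Only the weights change between calls, so the re-ranking can run on every slider movement while refreshed results are requested in the background.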
In
In addition, the third embodiment 700 includes N sliders, where N is the number of search query terms. As shown in
IIb. Search Query Term Replacement User Interface Component
The search query re-weighting UI component described above provides a direct-manipulation user interface for refining queries by manipulating search term weights and phrasing. However, the component does not allow rapid substitution of search query terms with other search query terms. In the same scenarios for which sliders are beneficial (such as for allowing novice searchers to refine their queries and for allowing rapid query refinement when typing is cumbersome), it is desirable to provide a searcher with a graphical way in which to rapidly substitute search terms.
To this end, in some embodiments the search query transformation system 100 includes a search query term replacement user interface (UI) component. In some embodiments this UI is a direct-manipulation user interface component called a “synonym tree”. The synonym tree allows rapid, keyboard-free, in-place replacement of search query terms with suggested substitutions. Note that while this is similar to the “suggested queries” functionality currently available from major search engines, that functionality operates only on a whole-query level. In contrast, the synonym tree allows more subtle refinement of queries, thus allowing an easier way to refine or expand an existing search query.
By way of example, if the searcher clicks on the synonym term “great”, the revised search query would be [good great interface toolkits]. On the other hand, if the searcher clicks on the plus sign next to the synonym term “great”, then the revised search query would be [(good or great) interface toolkits]. This is a very important search query pattern that is currently difficult to build in most search interfaces. The searcher typically does not care in this case whether he finds the word “great” or “good”, only that there is a positive sentiment expressed with one of these words. The synonym tree 820 makes it easy to rapidly expand the initial search query 810 by “or'ing together” words that are semantically equivalent to the user, requiring that the search results contain any of the equivalent terms, and thereby improving result coverage.
The synonym tree 820 presents a set of search terms that the system 100 determines are suitable replacements for this term in the initial search query 810. In some embodiments of the synonym tree 820, search terms are drawn from a thesaurus (a list of words that are similar according to language semantics), from a list of common replacements observed in previous queries, or both. Selection made from a list of common replacements observed in previous queries allows the synonym tree 820 to choose terms that are appropriate replacements according to the semantics of the Web, rather than just the semantics of the language. In this example, a thesaurus might tell us that “great” or “excellent” are synonyms for “good”, but an experienced Web searcher might in fact choose to replace “good” with “review”, which we can determine from search query logs but not from a thesaurus. In some embodiments this type of replacement suggestion can be particularly drawn from search query logs corresponding to expert Web searchers who have a strong understanding of this “Web semantics” replacement strategy.
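One possible way to combine the two sources of replacement terms is sketched below. The ordering (query-log replacements first, then thesaurus entries), the `limit` parameter, and the function name are assumptions made for illustration; the dictionaries stand in for a thesaurus and for replacements mined from search query logs.

```python
from typing import Dict, List


def replacement_candidates(term: str,
                           thesaurus: Dict[str, List[str]],
                           log_replacements: Dict[str, List[str]],
                           limit: int = 5) -> List[str]:
    """Merge suggestions for one term, listing query-log replacements first
    and dropping duplicates and the term itself."""
    merged: List[str] = []
    for candidate in log_replacements.get(term, []) + thesaurus.get(term, []):
        if candidate != term and candidate not in merged:
            merged.append(candidate)
    return merged[:limit]


thesaurus = {"good": ["great", "excellent"]}
log_replacements = {"good": ["review", "great"]}
candidates = replacement_candidates("good", thesaurus, log_replacements)
```

Here “review” appears only because it was observed in query logs, which is the “Web semantics” case described above.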
Note that because only a finite set of possible replacement queries exist for a given search query, it is possible for the system 100 to begin pre-fetching all possible replacement queries immediately after the searcher executes a revised search query, allowing very rapid interaction. For example, when the searcher submitted the query “good interface toolkits”, the system 100 would concurrently submit the queries “excellent interface toolkits”, “great interface toolkits”, and so forth. When the searcher clicks on one of the terms, an additional set of queries can be speculatively executed, including probable combinations of terms, such as [(great or good or review) interface toolkits].
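The set of queries to pre-fetch can be enumerated as in the following sketch. It only generates the candidate query strings; submitting them to a search engine and caching the responses is omitted. The function names are hypothetical, and the “or” grouping syntax follows the bracketed examples in this section.

```python
from typing import Dict, List


def single_term_replacements(query: List[str],
                             synonyms: Dict[str, List[str]]) -> List[str]:
    """Every query obtained by replacing exactly one term with one synonym."""
    queries = []
    for i, term in enumerate(query):
        for synonym in synonyms.get(term, []):
            queries.append(" ".join(query[:i] + [synonym] + query[i + 1:]))
    return queries


def or_expansion(query: List[str], index: int, chosen: List[str]) -> str:
    """Query in which the term at `index` is or'ed with the chosen synonyms."""
    group = "(" + " or ".join([query[index], *chosen]) + ")"
    return " ".join(query[:index] + [group] + query[index + 1:])


query = ["good", "interface", "toolkits"]
to_prefetch = single_term_replacements(query, {"good": ["excellent", "great"]})
expanded = or_expansion(query, 0, ["great"])
```

The number of candidates grows with the number of terms and synonyms shown in the tree, which is what keeps the pre-fetch set finite.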
IIc. Search Query Suggestion Component
Some embodiments of the search query transformation system 100 include a search query suggestion component. A first embodiment of the search query suggestion component is an expert searcher recommendation. This embodiment uses the queries of expert searchers to recommend possible settings for the sliders, or appropriate content for the synonym tree. While current search systems provide “query suggestion” (popular queries offered to users for query refinement), this embodiment offers “advanced query suggestions” that are the most popular versions of the current query that include advanced operators translated into the query abstraction offered by the slider. In other words, this would convey the recommended settings for the system 100 without the need for many searchers to interact with it to generate these data. It would also effectively translate queries submitted by more expert searchers into a visual representation that may be more easily understood by novice searchers. For example, if experts often type a query like [+(microsoft, weight:0.6)+(windows, weight: 0.4)−vista], the query slider can show this query to novices as a suggested new query without exposing the complex syntax. Similarly, “or” groupings made frequently by expert searchers could result in visual highlights on the plus-signs in the synonym tree 820, alerting novice searchers that expert searchers frequently “or” these terms together, without having to explain or show complex Boolean syntax.
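A minimal sketch of translating an expert query into slider settings follows. It assumes the query syntax is exactly that of the bracketed example above (weighted required terms written as “+(term, weight:w)” and excluded terms prefixed by a minus sign); the regular expressions and the function name `slider_settings` are illustrative and do not describe any particular search engine's grammar.

```python
import re
from typing import Dict, List, Tuple

# Matches "+(term, weight:0.6)"; the syntax is taken from the example query.
_WEIGHTED = re.compile(r"\+\(\s*(\w+)\s*,\s*weight:\s*([0-9]*\.?[0-9]+)\s*\)")
# Matches an excluded term written with an ASCII hyphen or a Unicode minus.
_EXCLUDED = re.compile(r"[-\u2212](\w+)")


def slider_settings(expert_query: str) -> Tuple[Dict[str, float], List[str]]:
    """Return (relative weight per required term, list of excluded terms)."""
    weights = {term: float(w) for term, w in _WEIGHTED.findall(expert_query)}
    # Remove the weighted groups before looking for excluded terms.
    remainder = _WEIGHTED.sub(" ", expert_query)
    excluded = _EXCLUDED.findall(remainder)
    total = sum(weights.values())
    if total > 0:
        # Normalize so the weights can be drawn as fractions of the box.
        weights = {term: w / total for term, w in weights.items()}
    return weights, excluded


weights, excluded = slider_settings(
    "+(microsoft, weight:0.6)+(windows, weight: 0.4)\u2212vista")
```

The resulting weights could position the slider buttons, and the excluded terms could be shown as removed, so that a novice sees the suggestion without the underlying syntax.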
Another embodiment of the search query suggestion component is a popular search recommendation. This embodiment presents information on how sliders in the system 100 are being used by many other searchers for the current query. In other words, it presents the most popular settings of the system 100 from many searchers for a particular search query. In some embodiments, this is represented as a line graph with the peaks representing the most used positions for the slider buttons. Query suggestions (typically offered as a list of options separate from the search results) can also be integrated into this embodiment, making the query reformulation process more consistent and removing the unnecessary separation that exists between query entry and query suggestions on many current Web interfaces.
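The peaks of such a line graph can be approximated by bucketing logged slider positions, as in the sketch below. It assumes positions are logged as fractions in the range 0 to 1 for a single slider and a single query; the bucket count, the sample values, and the function name `popular_positions` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def popular_positions(logged: Iterable[float],
                      bins: int = 10,
                      top: int = 3) -> List[Tuple[float, int]]:
    """Bucket logged slider positions (each in [0, 1]) and return the centers
    of the most used buckets together with their counts."""
    counts: Counter = Counter()
    for position in logged:
        if not 0.0 <= position <= 1.0:
            raise ValueError("slider positions must lie within [0, 1]")
        # Clamp so that position 1.0 falls in the last bucket.
        counts[min(int(position * bins), bins - 1)] += 1
    return [((bucket + 0.5) / bins, n) for bucket, n in counts.most_common(top)]


# Made-up log entries for one slider and one query.
peaks = popular_positions([0.61, 0.62, 0.65, 0.31, 0.33, 0.9], bins=10, top=2)
```

The bucket counts are the values that would be plotted, and the returned centers mark where the highest peaks fall.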
Embodiments of the search query transformation system and method are designed to operate in a computing environment. The following discussion is intended to provide a brief, general description of a suitable computing environment in which embodiments of the search query transformation system and method may be implemented.
Embodiments of the search query transformation system and method are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with embodiments of the search query transformation system and method include, but are not limited to, personal computers, server computers, hand-held (including smartphones), laptop or mobile computer or communications devices such as cell phones and PDAs, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Embodiments of the search query transformation system and method may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. Embodiments of the search query transformation system and method may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices. With reference to
Components of the computer 1010 may include, but are not limited to, a processing unit 1020 (such as a central processing unit, CPU), a system memory 1030, and a system bus 1021 that couples various system components including the system memory to the processing unit 1020. The system bus 1021 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
The computer 1010 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by the computer 1010 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 1010. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 1030 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 1031 and random access memory (RAM) 1032. A basic input/output system 1033 (BIOS), containing the basic routines that help to transfer information between elements within the computer 1010, such as during start-up, is typically stored in ROM 1031. RAM 1032 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 1020. By way of example, and not limitation,
The computer 1010 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 1041 is typically connected to the system bus 1021 through a non-removable memory interface such as interface 1040, and magnetic disk drive 1051 and optical disk drive 1055 are typically connected to the system bus 1021 by a removable memory interface, such as interface 1050.
The drives and their associated computer storage media discussed above and illustrated in
Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, radio receiver, or a television or broadcast video receiver, or the like. These and other input devices are often connected to the processing unit 1020 through a user input interface 1060 that is coupled to the system bus 1021, but may be connected by other interface and bus structures, such as, for example, a parallel port, game port or a universal serial bus (USB). A monitor 1091 or other type of display device is also connected to the system bus 1021 via an interface, such as a video interface 1090. In addition to the monitor, computers may also include other peripheral output devices such as speakers 1097 and printer 1096, which may be connected through an output peripheral interface 1095.
The computer 1010 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 1080. The remote computer 1080 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 1010, although only a memory storage device 1081 has been illustrated in
When used in a LAN networking environment, the computer 1010 is connected to the LAN 1071 through a network interface or adapter 1070. When used in a WAN networking environment, the computer 1010 typically includes a modem 1072 or other means for establishing communications over the WAN 1073, such as the Internet. The modem 1072, which may be internal or external, may be connected to the system bus 1021 via the user input interface 1060, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 1010, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
The foregoing Detailed Description has been presented for the purposes of illustration and description. Many modifications and variations are possible in light of the above teaching. It is not intended to be exhaustive or to limit the subject matter described herein to the precise form disclosed. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims appended hereto.