Users enter a variety of queries into the search boxes of search engines. While a user is entering such a query, a search engine may generate suggestions regarding the query that the user is currently entering into the search box. For example, suggested queries may be generated by a search engine providing an auto-suggest functionality that completes the un-entered characters in a term while the user is entering the characters at the beginning of the term. Such an auto-suggest functionality presents multiple variations of terms, and multiple options for completing an incomplete query. In presenting multiple variations for completing the characters in a term, queries are “expanded,” and users may select the expanded query that was generated using the auto-suggest functionality.
In some instances, while a search engine is presenting expanded queries for terms being entered, the search engine is also generating and displaying search results to the user based on the expanded queries. Although these search results may or may not be relevant to the completed query that the user eventually submits, the combination of auto-suggest completion of a query term and the automatic generation of query results is provided in order to assist users in retrieving the most relevant search results. However, in other instances, users entering lengthy queries with multiple terms into a search box may not utilize the auto-suggest functionality to complete individual terms, and also may not utilize the display of query results prior to completion of the user's intended search.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Embodiments of the present invention relate to user query reformulation in association with a search box. As differentiated from an auto-suggest feature that expands incomplete queries of all lengths, query reformulation refers to the reformulation of user queries that include a plurality of terms already entered by a user. In embodiments, query reformulation is performed on queries that include a particular number of terms that satisfy a threshold. Having received a user query with a plurality of terms that satisfies a threshold, a set of reformulated user queries is determined. Reformulated user queries are presented in association with the search box that received the initial user query, prior to the generation of search results satisfying the user query.
A set of reformulated user queries includes one or more member queries. The member queries include one or more suggestions for a reformulated user query, such as a suggested query term alteration and/or a suggested query term deletion. In one embodiment, reformulated user queries are ranked before being presented to a user. For example, ranked suggested query term alterations and ranked suggested query term deletions may be presented to a user in an order that is most relevant to the user's original query. In another embodiment, reformulated user queries are categorized into groups before being presented to a user in association with such groups. For example, the member queries of a set of reformulated user queries may be grouped into suggested query term alterations and suggested query term deletions.
In further embodiments, member queries in a set of reformulated user queries are presented to a user for selection, in association with a search box. Based on a user's selection of a suggested query term alteration or a suggested query term deletion, query results that satisfy the selected member query are generated. In one embodiment, a selection option is provided for a user to input additional terms in association with the original user query. Having received an additional term, a second set of reformulated user queries may be generated. Alternatively, query results may be generated that satisfy a new user query that includes the terms of the original user query and the additional terms input by the user.
The present invention is described in detail below with reference to the attached drawing figures, wherein:
The subject matter of the present invention is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.
Embodiments of the present invention are generally directed to reformulating user queries in association with a search box. More particularly, reformulated user queries are determined in response to a user query that satisfies a threshold. In some embodiments, the member queries in a set of reformulated user queries are presented to a user. Based on the user's selection of one of the member queries, query results satisfying the selected member query are generated.
In embodiments, reformulated user queries include suggested query term alterations and suggested query term deletions. A suggested query term alteration refers to a reformulated version of the entered user query with at least one of the terms replaced by another term. For example, a reformulated version of the query “verizon wireless phone” may include a suggested query term alteration of “verizon DSL phone,” having the term “wireless” replaced with the term “DSL” in the suggested query term alteration. In embodiments, a query term alteration includes replacing a term and/or a phrase including more than one term. A suggested query term deletion refers to a reformulated version of the entered user query with at least one of the terms removed. For example, a suggested query term deletion for the original query “verizon wireless phone” may include “wireless phone,” with the term “verizon” removed.
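The two kinds of reformulation described above can be illustrated with a minimal Python sketch. The function names and the `replacements` lookup table are hypothetical and are introduced here only for illustration; the sketch handles single-term replacement and removal and does not cover phrase-level alterations.

```python
from typing import Dict, List


def suggest_deletions(query: str) -> List[str]:
    """Return one reformulated query per term, with that term removed."""
    terms = query.split()
    return [" ".join(terms[:i] + terms[i + 1:]) for i in range(len(terms))]


def suggest_alterations(query: str, replacements: Dict[str, List[str]]) -> List[str]:
    """Return reformulated queries with one term replaced at a time.

    `replacements` is a hypothetical lookup table standing in for whatever
    source proposes replacement terms.
    """
    terms = query.split()
    suggestions = []
    for i, term in enumerate(terms):
        for alternative in replacements.get(term, []):
            suggestions.append(" ".join(terms[:i] + [alternative] + terms[i + 1:]))
    return suggestions


# Example using the query discussed in the text.
deletions = suggest_deletions("verizon wireless phone")
alterations = suggest_alterations("verizon wireless phone", {"wireless": ["DSL"]})
```

Under these assumptions, removing the first term yields “wireless phone” and replacing “wireless” with “DSL” yields “verizon DSL phone,” matching the examples given above.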
Reformulated user queries may be ranked, categorized into groups, and/or presented to a user for selection. Based on a user's selection of a reformulated user query, a number of query results that satisfy the selected reformulated user query are provided. Alternatively, a second set of reformulated user queries may be generated based on a user selection of a reformulated user query. In one embodiment, a selection option is provided for a user to input one or more additional terms. The terms of the original user query and the additional input terms may be used to generate a second set of reformulated user queries. Additionally, a number of query results satisfying the terms of the original user query and additional input terms may be generated.
Accordingly, one embodiment of the present invention is directed to one or more computer-readable media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method of query reformulation. The method comprises: receiving a first user query in association with a search box, the first user query including a plurality of terms; determining that the received first user query satisfies a threshold; and based on the received first user query, determining a first set of reformulated user queries, wherein the first set includes one or more member queries in association with the search box, further wherein the one or more member queries comprise at least one of the following: (1) one or more suggested query term alterations, wherein each of the one or more suggested query term alterations is determined based on replacing at least one term in the received first user query; and (2) one or more suggested query term deletions, wherein each of the one or more suggested query term deletions is determined based on removing at least one term in the received first user query.
In another embodiment, the invention is directed to a method performed by one or more server devices for reformulating user queries. The method comprises: receiving a first user query in association with a search box, the first user query including a plurality of terms; determining that the plurality of terms in the first user query satisfies a threshold; determining a first plurality of reformulated user queries in association with the search box, the first plurality of reformulated user queries comprising: (1) one or more query term alterations, wherein each of the one or more query term alterations is determined based on replacing at least one term in the received first user query; and (2) one or more query term deletions, wherein each of the one or more query term deletions is determined based on removing at least one term in the received first user query; and categorizing each of the first plurality of reformulated user queries into one or more groups, the one or more groups comprising: (1) the one or more query term alterations; and (2) the one or more query term deletions.
A further embodiment of the present invention is directed to a graphical user interface stored on one or more computer-storage media and executable by a computing device. The graphical user interface comprises: a search box for receiving a user query, the user query having a plurality of terms; and one or more of the following sections: (1) a section that displays one or more query term alterations in association with the search box, wherein each of the one or more query term alterations is determined based on replacing at least one term in the received user query; and (2) a section that displays one or more query term deletions in association with the search box, wherein each of the one or more query term deletions is determined based on removing at least one term in the received user query.
Having described an overview of embodiments of the present invention, an exemplary operating environment in which embodiments of the present invention may be implemented is described below in order to provide a general context for various aspects of the present invention. Referring initially to
The invention may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program modules including routines, programs, objects, components, data structures, etc., refer to code that performs particular tasks or implements particular abstract data types. Embodiments of the invention may be practiced in a variety of system configurations, including hand-held devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. Embodiments of the invention may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network.
With continued reference to
The computing device 100 typically includes a variety of computer-readable media. Computer-readable media can be any available media accessible by the computing device 100 and includes both volatile and nonvolatile media, and removable and non-removable media, implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer-readable media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computing device 100. Combinations of any of the above are also included within the scope of computer-readable media.
The memory 112 includes computer-storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical-disc drives, etc. The computing device 100 includes one or more processors that read data from various entities such as the memory 112 or the I/O components 120. The presentation component(s) 116 present data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, etc.
The I/O ports 118 allow the computing device 100 to be logically coupled to other devices including the I/O components 120, some of which may be built in. Illustrative components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, etc.
As indicated previously, embodiments of the present invention are directed to reformulating user queries in association with a search box. A reformulated user query refers to a user query with one or more terms altered, replaced, deleted, removed, corrected for spelling and/or grammatical errors, and/or otherwise changed from the originally-submitted user query. Reformulated user queries are determined from user queries that include a plurality of terms. Based on the plurality of terms satisfying a predetermined threshold, a set of reformulated user queries is determined. In one embodiment, the threshold for determining a set of reformulated user queries requires that the user query includes three or more terms. For example, while the user query “wireless phone” does not trigger the generation of a reformulated user query, the query “verizon wireless phone” does, according to a threshold requiring three terms in the originally-submitted user query. In embodiments, a user query including more than three terms is referred to as a “long” user query. Such “long” user queries may satisfy the threshold for determining a set of reformulated user queries.
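The threshold determination described above can be sketched in a few lines of Python. The function name and the `min_terms` parameter are hypothetical; the default of three reflects the one embodiment described above, and splitting on whitespace is an assumed, simplified way of counting terms.

```python
def satisfies_threshold(query: str, min_terms: int = 3) -> bool:
    """Return True when the query contains at least `min_terms` terms.

    Terms are counted by splitting on whitespace, which is an assumption
    made for this sketch.
    """
    return len(query.split()) >= min_terms


# "wireless phone" has two terms and does not trigger reformulation,
# while "verizon wireless phone" has three terms and does.
triggers_short = satisfies_threshold("wireless phone")
triggers_long = satisfies_threshold("verizon wireless phone")
```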
Determining a plurality of reformulated user queries utilizes a variety of sources. In embodiments, reformulated user queries are determined using alteration services, query and session logs, and/or alteration scores. An alteration service provides a list of potential alterations to a term and/or phrase (that includes more than one term) in an original user query and an indication of a confidence level of the relevance of the proposed alterations. Query and session logs refer to sources that provide data retrieved from previously-submitted user queries and previous periods of user interaction. Alteration scores refer to the scores assigned to a reformulated user query based on a determined confidence level that the reformulated user query will provide relevant results. As will be discussed in further detail below, reformulated user queries may also be determined using specificity scores, inverse document frequency, and information gain.
Determining which reformulated user queries to present to a user also utilizes a variety of sources, including query and session logs, query quality predictions, alteration scores, suggested term sources, and/or a web document center. Query quality predictions refer to the quality of results retrieved in response to a particular user query, as described in full detail in U.S. patent application Ser. No. 12/969,140, entitled “Classifying Results of Search Queries,” having Attorney Docket Number 331078.01/MFCP.157702, filed Dec. 15, 2010, which is hereby incorporated by reference. A suggested term source refers to the use of multiple sources from which to retrieve suggested terms. A web document center provides information regarding the content of webpages retrieved in response to a particular query. For example, if the user query “verizon wireless phone” and the reformulated user query “cingular wireless phone” retrieve search results with similar content, then a determination may be made that the replaced term in the reformulated user query is an appropriate reformulation candidate, such as a suggested query term alteration.
Using one or more of these sources, a score is generated for each type of reformulated user query, including suggested query term alterations and suggested query term deletions. For example, a set of reformulated user queries may include one or more suggested query term alterations (which may also be referred to as the “member queries” in a reformulated user query set). The suggested query term alterations may be scored using one or more of the listed sources, such as the query and session logs, query quality predictions, alteration scores, and/or suggested term sources. Similarly, the member queries of a reformulated user query set including suggested query term deletions may be scored using a variety of the sources listed above, including query and session logs, query quality predictions, and/or alteration scores.
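The comparison of retrieved content for an original and a reformulated query can be sketched as follows. The description does not state how similarity of content is measured; this sketch swaps in Jaccard overlap of the terms found in the retrieved text purely as an illustrative stand-in. The function names and the `min_overlap` cutoff of 0.5 are hypothetical.

```python
from typing import Iterable, Set


def content_terms(results: Iterable[str]) -> Set[str]:
    """Collect the lowercase terms appearing in a collection of result texts."""
    terms: Set[str] = set()
    for text in results:
        terms.update(text.lower().split())
    return terms


def results_overlap(original_results: Iterable[str],
                    reformulated_results: Iterable[str]) -> float:
    """Jaccard overlap between the term sets of two result collections."""
    a = content_terms(original_results)
    b = content_terms(reformulated_results)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_alteration_candidate(original_results: Iterable[str],
                            reformulated_results: Iterable[str],
                            min_overlap: float = 0.5) -> bool:
    """Treat the replacement as a candidate when result content is similar.

    The 0.5 cutoff is an arbitrary value chosen for this sketch.
    """
    return results_overlap(original_results, reformulated_results) >= min_overlap
```

For instance, result texts that differ only in the carrier name share most of their terms and would pass this check, while unrelated result texts would not.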
The scores generated for each reformulated user query are used to rank the reformulated user queries. Such ranking may be done using a machine-learned model that is trained to predict the importance and/or relevance of reformulated user queries. Ranking a reformulated user query in relation to the importance and/or relevance of the reformulated user query refers to prioritizing which reformulated queries are most likely to generate results that are responsive to the user's intended query. For example, ranking may determine that a suggested query term alteration with the first term replaced in a query containing three terms is most relevant to a user's intended query. As such, suggested query term alterations with the first terms replaced may be listed near the top of a plurality of member queries presented to a user.
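The use of scores to order reformulated user queries can be sketched as follows. The description contemplates a machine-learned model; this sketch does not implement such a model and instead assumes that a score has already been produced for each member query. The function name and the example scores are hypothetical.

```python
from typing import List, Tuple


def rank_member_queries(scored: List[Tuple[str, float]]) -> List[str]:
    """Order member queries from highest to lowest score.

    `scored` pairs each reformulated query with a precomputed score; how the
    score is produced (for example, by a trained model) is outside this sketch.
    Python's sort is stable, so queries with equal scores keep their input order.
    """
    return [query for query, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]


# Hypothetical scores, chosen only to show the ordering.
ranked = rank_member_queries([
    ("verizon DSL phone", 0.42),
    ("cingular wireless phone", 0.81),
    ("wireless phone", 0.65),
])
```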
In one embodiment, reformulated user queries may be ranked using a machine-learned model that is trained to predict which term variations (in either a suggested query term alteration or a suggested query term deletion) provide the most relevant search results in relation to the original user query. In further embodiments, additional tools are used to enhance the accuracy of a machine-learned model, such as random flight, alteration scores, positional bias, and the like. As will be understood, the use of a machine-learned model to rank reformulated user queries, and subsequently determining the order in which to present the reformulated user queries to a user, is not limited to one source of information or one method of data generation.
In embodiments, reformulated user queries are presented to a user according to a ranking. For example, higher-ranked reformulated user queries are presented above lower-ranked reformulated user queries. In further embodiments, in addition to rankings that are based on assigned scores, reformulated user queries may be presented to a user based on individual logic pertaining to the type of reformulated user query. For example, one logic for suggested query term alterations may present member queries in the order of the terms that are replaced, such as listing member queries with a first term replaced above member queries with a second term replaced. As will be discussed in detail below, suggested query term alterations may be presented to a user based on one associated logic, while suggested query term deletions may be presented to a user based on a different associated logic. As such, although similar sources may be utilized to generate reformulated user queries based on a submitted user query, determining which suggested query term alterations and which suggested query term deletions to display may utilize separate logic.
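The idea of grouping member queries by type and applying separate presentation logic to each group can be sketched as follows. The `MemberQuery` record and its fields are hypothetical. Ordering alterations by the position of the replaced term follows the example above; ordering deletions by descending score is an assumption made only so that the two groups use visibly different logic.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MemberQuery:
    text: str       # the reformulated query
    kind: str       # "alteration" or "deletion"
    position: int   # index of the replaced or removed term in the original query
    score: float    # hypothetical relevance score


def group_for_display(members: List[MemberQuery]) -> Dict[str, List[str]]:
    """Group member queries by kind and order each group with its own logic."""
    groups: Dict[str, List[MemberQuery]] = defaultdict(list)
    for member in members:
        groups[member.kind].append(member)
    # Alterations: first-term replacements listed before second-term replacements.
    groups["alteration"].sort(key=lambda m: m.position)
    # Deletions: highest score first (an assumed rule for this sketch).
    groups["deletion"].sort(key=lambda m: m.score, reverse=True)
    return {kind: [m.text for m in ms] for kind, ms in groups.items()}


sections = group_for_display([
    MemberQuery("verizon DSL phone", "alteration", 1, 0.4),
    MemberQuery("cingular wireless phone", "alteration", 0, 0.2),
    MemberQuery("verizon wireless", "deletion", 2, 0.3),
    MemberQuery("wireless phone", "deletion", 0, 0.9),
])
```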
As shown in
Suggested query term alterations 214 includes member queries 216 which are reformulated user queries with replaced terms. As shown in
Suggested query term deletions 218 includes member queries 220 which are reformulated user queries with removed terms. As shown in
In a further embodiment, a term's specificity score is used to determine which term to remove and/or delete from a user query 212 when determining member queries 220. A specificity score refers to the degree of specificity of a term. In embodiments, “specificity,” or “selectional preference,” of a term t is defined as the divergence between the unigram model of the query language and the unigram model of the sub-language of queries containing t. As such, a score based on such specificity may be used to determine which term to remove and/or delete from a user query 212 when determining member queries 220.
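A specificity score of this kind can be sketched in Python as follows. The description refers only to a "divergence" between two unigram models; this sketch assumes Kullback-Leibler divergence, taken from the sub-language model to the full query-language model, and estimates both models from a small hypothetical query log without smoothing. The function names and the sample log are illustrative only.

```python
import math
from collections import Counter
from typing import Iterable, List


def unigram_counts(queries: Iterable[str]) -> Counter:
    """Count term occurrences across a collection of queries."""
    counts: Counter = Counter()
    for query in queries:
        counts.update(query.split())
    return counts


def specificity(term: str, query_log: List[str]) -> float:
    """KL divergence from the unigram model of queries containing `term`
    to the unigram model of the whole query log (an assumed choice of divergence).
    """
    all_counts = unigram_counts(query_log)
    sub_counts = unigram_counts(q for q in query_log if term in q.split())
    all_total = sum(all_counts.values())
    sub_total = sum(sub_counts.values())
    if sub_total == 0:
        return 0.0  # the term never occurs, so there is no sub-language to compare
    score = 0.0
    for word, count in sub_counts.items():
        p_sub = count / sub_total
        p_all = all_counts[word] / all_total  # nonzero: the sub-log is part of the log
        score += p_sub * math.log(p_sub / p_all)
    return score


# Hypothetical query log used only to exercise the function.
log = ["verizon wireless phone", "verizon dsl phone", "cheap phone"]
```

In this sample log, “phone” occurs in every query, so its sub-language equals the full query language and its score is zero, whereas “verizon” narrows the log and receives a positive score.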
Similarly, in further embodiments, a term's inverse document frequency may be used to determine whether it should be removed and/or deleted from a user query 212. A term's inverse document frequency refers to the value obtained by dividing one by the number of documents on the internet in which the term occurs. As such, a lower inverse document frequency score correlates to a less-specific query term, which further suggests that the term is a better candidate for deletion/removal as part of the member queries 220 in suggested query term deletions 218.
In another embodiment, an alteration service is used to determine member queries 220 for suggested query term deletions 218. For example, an alteration service may detect particular phrases within a user query 212, such as the phrase “wireless phone.” Such phrasal detection may then be used to generate an inverse document frequency for the detected phrase. This may also be referred to as the detection of frequency of bigrams, or pairs of words, on the internet. In further embodiments, information gain may be used to determine how well a term in the user query 212 fits with other documents on the internet, which is in turn used to determine which terms to remove.
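The use of inverse document frequency to pick a deletion candidate can be sketched as follows, using the definition given above (one divided by the number of documents containing the term). The function names and the small document collection are hypothetical, and treating a term that occurs in no document as having infinite inverse document frequency is an assumption made for this sketch.

```python
import math
from typing import List


def inverse_document_frequency(term: str, documents: List[str]) -> float:
    """One divided by the number of documents in which the term occurs."""
    document_count = sum(1 for doc in documents if term in doc.lower().split())
    if document_count == 0:
        return math.inf  # assumed convention: an unseen term is maximally specific
    return 1.0 / document_count


def deletion_candidate(query: str, documents: List[str]) -> str:
    """Return the query term with the lowest inverse document frequency."""
    return min(query.split(), key=lambda t: inverse_document_frequency(t, documents))


# Hypothetical document collection standing in for documents on the internet.
docs = [
    "verizon wireless phone plans",
    "cheap phone deals",
    "phone repair",
    "wireless router",
]
candidate = deletion_candidate("verizon wireless phone", docs)
```

With this collection, “phone” occurs in the most documents, has the lowest inverse document frequency, and is therefore selected as the term to remove.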
Suggested query term additions 222 provides an additional query 224, with the original user query 226 and a selection option 228 for indicating that a user intends to add an additional term to the original user query 226. In one embodiment, a user may select the selection option 228 to indicate that the user intends to enter an additional query term. Upon selection of the selection option 228, an additional query term entered by a user may automatically populate the search box 210. Alternatively, an additional query term may be entered in an additional text input box presented to a user based on selection of the selection option 228. While a user is entering an additional term in association with query term additions 222, member queries 216 in suggested query term alterations 214 and member queries 220 in suggested query term deletions 218 remain static, such that a user can view the member queries 216 and 220 in each section while determining which term to add to the original user query 212.
In one embodiment, having entered an additional term, the new user query (including the original user query 212 and the additional term added in association with query term additions 222) is used to retrieve a plurality of search results that satisfy the new user query. In another embodiment, the new user query populates the search box 210, and new sets of member queries 216 and 220 are generated for the new user query.
Referring now to
Turning now to
With reference now to
At block 516, the plurality of member queries in the first set are presented to a user, with each member query being selectable. At block 518, a user selection of one of the member queries is received. At block 520, a second set of reformulated user queries is determined. The second set of reformulated user queries includes a plurality of member queries. While the first set of member queries determined at block 514 is determined based on the original user query received at block 510, the second set of reformulated user queries is based on the member query selected at block 518.
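The sequence of blocks 510 through 520 can be sketched as follows. For brevity the sketch generates only term deletions, which is a simplification standing in for the full determination of member queries. The `reformulate` function, the three-term threshold applied to the second round, and the sample four-term query are all assumptions made for illustration.

```python
from typing import List


def reformulate(query: str, min_terms: int = 3) -> List[str]:
    """Return member queries for `query`, or an empty list below the threshold.

    Only single-term deletions are generated in this sketch.
    """
    terms = query.split()
    if len(terms) < min_terms:
        return []
    return [" ".join(terms[:i] + terms[i + 1:]) for i in range(len(terms))]


# Original user query received and first set determined.
first_set = reformulate("verizon wireless phone plans")

# Member queries are presented and the user selects one; here, the first.
selected = first_set[0]

# Second set is determined from the selected member query,
# rather than from the original user query.
second_set = reformulate(selected)
```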
Referring next to
At block 618, a user selection of one of the member queries is received. For example, as illustrated in
At block 624, based on the selection option presented at block 616, additional terms are input by a user. At block 626, a second set of reformulated user queries is determined in response to the additional term input by the user. Alternatively, at block 628, a plurality of query results that satisfy the terms of the original user query and the additional input term may be generated. As previously discussed with reference to
Turning now to
Referring finally to
As can be understood, embodiments of the present invention provide a method of reformulating user queries in association with a search box. The present invention has been described in relation to particular embodiments, which are intended in all respects to be illustrative rather than restrictive. Alternative embodiments will become apparent to those of ordinary skill in the art to which the present invention pertains without departing from its scope.
From the foregoing, it will be seen that this invention is one well adapted to attain all the ends and objects set forth above, together with other advantages which are obvious and inherent to the system and method. It will be understood that certain features and subcombinations are of utility and may be employed without reference to other features and subcombinations. This is contemplated by and is within the scope of the claims.