Not Applicable
1. Field of Invention
This invention relates to language-based search methods.
2. Review and Limitations of the Prior Art
Today's search engines do not produce search results that are driven, consolidated, and summarized by the major contextual uses of a search input phrase. As a result, search results can be a disorganized jumble of different contextual uses of the search input phrase. This forces a user to wade through pages of results (manually scanning titles, text snippets, and URLs) in order to pick out those entries that relate to the context in which the user is interested. For example, if one enters the phrase “keep running” in the search box of today's dominant search engine, then one gets a page of results with a jumble of entries that bounce around from health and fitness . . . to musical songs . . . to computer programs . . . and even to nuclear power plants. There is no organization or summary of results by phrase context to guide the user. Entries with the desired context are randomly sprinkled throughout several pages of results. This makes poor use of the user's time.
The designers of today's search engines are no doubt aware of this problem of context-jumbled searches. This is probably why they have developed features such as an interactive search box that provides a user with a pop-up menu of auto-completion search phrase options, in real time, as the user enters characters into the search box. This interactive search box is probably intended to help clarify the desired context for the input phrase in an effort to reduce the problem of context-jumbled searches. However, an interactive search box does not satisfactorily address the fundamental flaw of search engines that are not driven by search phrase context. For example, a search box that offers auto-completion options for words to the right of the characters that have been entered does not provide holistic variation of the entire input phrase. In particular, it does not offer variation in word order or any other phrase variation that changes the characters that have already been entered. An interactive search box with a pop-up window is therefore not a satisfactory means through which to refine search context and to provide holistic variation of the search input phrase. It is limited by user time, screen space, and character entry order effects.
The prior art includes several different methods for creating variation in search queries, including not only interactive search boxes, but also automatic creation of variation in search queries. Examples in the prior art with methods for creating variation in search queries include the following: U.S. Pat. No. 5,265,065 (Turtle, 1993), U.S. Pat. No. 5,418,948 (Turtle, 1995), U.S. Pat. No. 5,675,819 (Schuetze, 1997), U.S. Pat. No. 5,933,822 (Braden-Harder et al., 1999), U.S. Pat. No. 5,963,940 (Liddy et al., 1999), U.S. Pat. No. 6,026,388 (Liddy et al., 2000), U.S. Pat. No. 6,185,576 (McIntosh, 2001), U.S. Pat. No. 6,321,224 (Beall et al., 2001), U.S. Pat. No. 6,327,590 (Chidlovskii et al., 2001), U.S. Pat. No. 6,510,406 (Marchisio, 2003), U.S. Pat. No. 6,519,585 (Kohli, 2003), U.S. Pat. No. 6,751,611 (Krupin et al., 2004), U.S. Pat. No. 6,766,320 (Wang et al., 2004), U.S. Pat. No. 7,051,023 (Kapur et al., 2006), U.S. Pat. No. 7,231,343 (Treadgold et al., 2007), U.S. Pat. No. 7,231,379 (Parikh et al., 2007), U.S. Pat. No. 7,260,567 (Parikh et al., 2007), U.S. Pat. No. 7,287,025 (Wen et al., 2007), U.S. Pat. No. 7,346,490 (Fass et al., 2008), U.S. Pat. No. 7,370,056 (Parikh et al., 2008), U.S. Pat. No. 7,398,201 (Marchisio et al., 2008), U.S. Pat. No. 7,428,529 (Zeng et al., 2008), U.S. Pat. No. 7,440,941 (Borkovsky et al., 2008), U.S. Pat. No. 7,475,063 (Datta et al., 2009), U.S. Pat. No. 7,536,408 (Patterson, 2009), U.S. Pat. No. 7,562,069 (Chowdhury et al., 2009), U.S. Pat. No. 7,580,921 (Patterson, 2009), U.S. Pat. No. 7,580,929 (Patterson, 2009), U.S. Pat. No. 7,584,175 (Patterson, 2009), U.S. Pat. No. 7,599,914 (Patterson, 2009), U.S. Pat. No. 7,599,930 (Burns et al., 2009), U.S. Pat. No. 7,630,978 (Li et al., 2009), U.S. Pat. No. 7,630,980 (Parikh, 2009), U.S. Pat. No. 7,634,462 (Weyand et al., 2009), and U.S. Pat. No. 7,636,714 (Lamping et al., 2009); and U.S. Patent Application Nos. 20060206474 (Kapur et al., 2006), 20070106937 (Cucerzan et al., 2007), 20080091670 (Ismalon, 2008), 20080114721 (Jones et al., 2008), 20080215564 (Bratseth, 2008), 20080319962 (Riezler et al., 2008), 20090083028 (Davtchev et al., 2009), 20090193008 (De et al., 2009), 20090216737 (Dexter, 2009), 20090259647 (Curtis, 2009), and 20090327269 (Paparizos et al., 2009).
However, none of the prior art provides a context-driven search method whose results are driven, grouped, and summarized according to the major contextual uses of the search input phrase. Such a context-driven method would correct the context-related problems noted above, so that users would no longer have to wade through pages of results to pick out those entries for the context in which they are interested. The invention disclosed herein is a context-driven search method that directly addresses and solves these context-related problems with search methods in the prior art.
The invention disclosed herein is a context-driven search method comprising three main steps: (1) having a user provide an input phrase that is used to search a collection of language-based information sources; (2) identifying sets of substantially-equivalent expanded phrases that are relevant to the input phrase, wherein these expanded phrases appear in the collection of language-based information sources; and (3) providing the user with set-specific summary information concerning some, or all, of these sets of substantially-equivalent expanded phrases. This innovative context-driven method provides search results that are driven, consolidated, and summarized by phrase context. As a result, users no longer have to wade through pages of results to pick out those entries that relate to the context in which they are interested.
These figures show an example of how this invention may be embodied, but they do not limit the full generalizability of the claims.
In an example, the collection of language-based information sources may be the pages that comprise the Internet. In other examples, the collection of language-based information sources may be selected from one or more sources in the group consisting of: books, journals, magazines, newspapers, reports, emails, datasets with text, voice transcriptions, and files. In an example, the collection of language-based information sources that is searched using this context-driven method may be a subset of a larger collection of language-based information sources that has been initially ranked and selected by applying some other search method.
In various examples, a “minor variation” of a certain phrase may be defined as a variation selected from one or more of the variations in the group consisting of: spelling variation; grammar variation; modifier variation; order variation; case variation; punctuation variation; and phrase synonyms.
In various examples, the positional relationship between the input phrase (or minor variation thereof) and the one or more additional phrases within an expanded phrase is selected from the group of relationships consisting of: one additional phrase preceding the input phrase (or minor variation thereof); one additional phrase following the input phrase (or minor variation thereof); one additional phrase preceding the input phrase (or minor variation thereof) and one additional phrase following the input phrase (or minor variation thereof); and more than two additional phrases preceding or following the input phrase (or minor variation thereof). Also, in various examples, the additional phrase may be identified within a certain distance from the input phrase (or minor variation thereof), wherein this distance is measured by characters or words.
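In an example, the identification of additional phrases within a certain distance of the input phrase may be sketched as follows. This is only an illustrative sketch, not a definitive implementation: it assumes that distance is measured in words, that each additional phrase is a single word, and that only exact (case-insensitive) occurrences of the input phrase are matched. The names `tokenize`, `neighbors_within`, and `max_distance` are hypothetical and do not appear elsewhere in this description.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def neighbors_within(text, phrase, max_distance=3):
    """Return (preceding, following) word lists for each occurrence of phrase.

    Distance is measured in words. max_distance is an assumed, illustrative
    parameter, not a value taken from this description.
    """
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    hits = []
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            preceding = words[max(0, i - max_distance):i]
            following = words[i + n:i + n + max_distance]
            hits.append((preceding, following))
    return hits

# Illustrative sentence, with a window of two words on each side.
hits = neighbors_within(
    "You can keep running your computer all night.", "keep running", max_distance=2
)
print(hits)  # [(['you', 'can'], ['your', 'computer'])]
```

Measuring the distance in characters instead of words would only change how the two slices are computed.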
In an example, the results from this context-driven search method may be in the form of several lines on a computer screen, wherein summary information for a given set of substantially-equivalent phrases is shown on a single line. In another example, the results from this context-driven search method may be in the form of several paragraphs on a computer screen, wherein summary information for a given set of substantially-equivalent phrases is shown in a single paragraph. In various examples, set-specific summary information (such as shown on a single line or in a single paragraph) may include: one expanded phrase from the set to conceptually represent all of the variations of substantially-equivalent phrases that comprise that set; the number of times that any expanded phrase in this set appears throughout the collection of language-based information sources; the number of different information sources that include one or more of the expanded phrases in this set; the number of times that prior users have explored detail on this set in previous searches; a measure of the degree of variation in phrase wording among expanded phrases within the set; or other summary indicators to help the user select among sets.
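In an example, several of the set-specific summary indicators listed above may be computed as sketched below. This sketch assumes that each set is held as a list of (source identifier, expanded phrase) pairs, that the most frequent wording serves as the representative phrase, and that the count of distinct wordings serves as a crude measure of variation. The function names, the data layout, and the sample data are all hypothetical.

```python
from collections import Counter

def summarize_set(occurrences):
    """Summarize one set of substantially-equivalent expanded phrases.

    occurrences: list of (source_id, expanded_phrase) pairs (assumed layout).
    """
    phrase_counts = Counter(phrase for _, phrase in occurrences)
    # Assumption: the most frequent wording stands in for the whole set.
    representative = phrase_counts.most_common(1)[0][0] if phrase_counts else None
    return {
        "representative": representative,
        "occurrences": len(occurrences),
        "sources": len({source for source, _ in occurrences}),
        # Crude proxy for the degree of variation in wording within the set.
        "distinct_wordings": len(phrase_counts),
    }

def summary_line(summary):
    """Format the summary for display on a single line."""
    return (f"{summary['representative']} | "
            f"{summary['occurrences']} occurrences | "
            f"{summary['sources']} sources")

# Invented sample data for illustration only.
occurrences = [
    ("doc1", "keep running your computer"),
    ("doc1", "keep running your computer"),
    ("doc2", "computer keeps running"),
]
summary = summarize_set(occurrences)
print(summary_line(summary))  # keep running your computer | 3 occurrences | 2 sources
```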
In an example, in step 103 in
The method of context-driven search disclosed herein corrects a significant disadvantage of search engines in the prior art. Search engines that do not organize search results by context generate lists of search results that are jumbled and inefficient for users with respect to contextual use of the search phrase. Such engines in the prior art require a user to wade manually through a list of results in order to pick out only those entries that relate to the context of interest. Entries with a context of interest can be randomly mixed into several pages of search results that include contextual uses of the input phrase that are of no interest to the user.
For example, if one enters the phrase “keep running” into the search box of today's most popular search engine, then one gets a first page of results that have a jumble of entries that conceptually bounce around, in a relatively random manner, among several unrelated contexts: health and fitness; musical songs; computer programs; and nuclear power plants. There is no organization by phrase context. There is no consolidation of results by context. The user, who probably is not interested in information on both nuclear power plants and musical songs, has to wade manually through the list, scanning source titles and text snippets in an effort to find entries with the particular context in which she or he is interested. Even more frustrating, the entries that are relevant to the desired context may be randomly sprinkled throughout not only the first page of results, but throughout several subsequent pages as well.
The lack of organization by context in today's search engines makes poor use of a user's time. This disadvantage is directly addressed and corrected by the context-driven search method that is disclosed herein. This context-driven search method corrects this problem by automatically grouping and organizing search results by the contextual use of the input phrase in information sources. This context-driven search method helps the user to find those information sources that are most relevant to their desired context in an intuitive and efficient manner.
This context-driven search method also corrects a second, but related, disadvantage of search engines in the prior art. This problem relates to variation in wording. How can a search engine handle variation in wording in the search input phrase vs. wording in the information sources that are being searched? As discussed earlier in this section, variation in wording can include: spelling variation, grammar variation, modifier variation, order variation, case variation, punctuation variation, and phrase synonyms. If a search engine narrowly searches for information sources with only the input phrase that was entered into the search box, then the engine will likely fail to find important sources that contain minor variations on the input phrase. The context-driven search method that is disclosed herein corrects this problem by identifying sets of substantially-equivalent expanded phrases that are relevant to the input phrase.
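In an example, some of these kinds of variation may be neutralized by reducing each phrase to a comparison key, so that phrases with the same key are treated as minor variations of one another. The sketch below is deliberately crude and rests on stated assumptions: case and punctuation are discarded, a short hypothetical suffix list stands in for grammatical handling, and sorting the words ignores word order. Spelling variation, modifier variation, and phrase synonyms are not handled. The names `crude_stem` and `variation_key` are illustrative only.

```python
import re

SUFFIXES = ("ing", "es", "s")  # assumed, illustrative list; not a real stemmer

def crude_stem(word):
    """Strip one common suffix, leaving a stem of at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def variation_key(phrase):
    """Reduce a phrase to a key that ignores case, punctuation, order, and simple endings."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    return tuple(sorted(crude_stem(w) for w in words))

print(variation_key("keep running"))     # ('keep', 'runn')
print(variation_key("Keeps running!"))   # ('keep', 'runn')
print(variation_key("running, KEEP"))    # ('keep', 'runn')
```

A production system would more plausibly use an established stemmer or lemmatizer and a synonym resource; the suffix list here only keeps the sketch self-contained.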
Today's search engine designers are no doubt aware of the two problems identified above: (1) the lack of organization of search results with respect to contextual use of the input phrase; and (2) missing relevant results due to variation in wording in the search input phrase vs. wording in the information sources. This is probably why they have developed an interactive search box that provides a user with a pop-up menu of auto-completion options as the user enters characters into the search box from left to right. This interactive search box is probably intended to correct these two problems by prompting the user with a menu that: clarifies the desired context for the input phrase to reduce the problem of context-jumbled searches; and encourages the user to use frequently-used (“standard”) key words instead of seldom-used (“non-standard”) words to reduce the problem of missed sources due to word variation.
However, these problems have not been satisfactorily solved by an interactive search box. An interactive search box with a pop-up menu of auto-completion options does not satisfactorily address the fundamental flaw of search engines that are not driven by context. For example, an interactive search box with a pop-up menu of auto-completion options for words to the right of the characters that have been entered thus far by the user does not offer holistic variation of the entire input phrase. It does not offer variation in word order or any other phrase variation that changes the characters that have been previously entered when additional characters are entered. It would probably overwhelm the user to incorporate a wide range of holistic variation in the context of a search box with a pop-up menu. In contrast, the context-driven search method disclosed herein addresses this problem. It automatically identifies and groups sets of substantially-equivalent expanded phrases and presents the user with helpful summary information concerning those sets. This all happens behind the scenes and does not overwhelm the user. Accordingly, the context-driven search method disclosed herein is a significant improvement over search engines in the prior art, even those with interactive search boxes.
The context-driven search method disclosed herein handles a wide variety of variation in input phrases and provides the user with an efficient, organized summary of sets of expanded phrases that conveys different contexts for the input phrase. Multiple results with information on substantially-equivalent expanded phrases are clearly and conveniently summarized for the user to review and to select the context that is of greatest interest. The user does not have to wade through a jumble of results in different contexts. The user can quickly see the top contexts in which the phrase is used. In an example, the user can click on the desired context to see individual results (by phrase or by source) for that contextual usage of the input phrase.
The top portion of
In an example, identification of expanded phrases and grouping them together into sets of substantially-equivalent expanded phrases may be done by software. This software may be based on variations of the input phrase in accordance with the definition of “minor variation” that was provided earlier in this description. In this example, a grammatical variation in verb tense in the input phrase “keep running” is used to identify the expanded phrase 204 “ . . . so that your computer keeps running . . . ” Also in this example, expanded phrase 210 “ . . . can keep running your computer . . . ” includes the input phrase and also the additional phrase “computer,” so it is grouped into the same set of substantially-equivalent expanded phrases as phrase 204. This identification and grouping of substantially-equivalent expanded phrases can occur behind the scenes from the perspective of the user. The user only has to see the useful results of this process that are displayed at the bottom of
In this example, the collection of language-based information sources did not include the phrase “keep on trucking.” In an example that would be more appealing to Grateful Dead fans, the phrase “keep on trucking” would be among the information sources. Ironically, the example herein would likely have included the phrase “keep on trucking” were it not for bothersome memory lapses on the part of the inventor.
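In an example, the software-based identification and grouping of expanded phrases described above may be sketched as follows. This is a minimal sketch under stated assumptions rather than the method itself: a trailing “s” is the only grammatical variation handled, a small hypothetical stop-word list separates additional phrases from filler words, and expanded phrases are grouped by the additional words that remain. The first two snippets correspond to expanded phrases 204 and 210; the third snippet is invented for illustration. All function names are hypothetical.

```python
import re
from collections import defaultdict

STOP_WORDS = {"so", "that", "your", "can", "for", "the"}  # illustrative only

def normalize(word):
    """Crudely undo one grammatical variation: a trailing 's' (keeps -> keep)."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def words_of(text):
    """Lowercase, tokenize, and normalize the words of a text."""
    return [normalize(w) for w in re.findall(r"[a-z]+", text.lower())]

def group_expanded_phrases(input_phrase, snippets):
    """Group snippets that contain the input phrase (or a minor variation
    of it) by the additional, non-stop words that accompany it."""
    target = words_of(input_phrase)
    n = len(target)
    sets = defaultdict(list)
    for snippet in snippets:
        words = words_of(snippet)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == target:
                extra = words[:i] + words[i + n:]
                key = tuple(sorted(w for w in extra if w not in STOP_WORDS))
                sets[key].append(snippet)
                break  # count each snippet once
    return dict(sets)

snippets = [
    "so that your computer keeps running",  # expanded phrase 204
    "can keep running your computer",       # expanded phrase 210
    "keep running for your health",         # invented for illustration
]
sets = group_expanded_phrases("keep running", snippets)
for key, members in sets.items():
    print(key, len(members))
# ('computer',) 2
# ('health',) 1
```

Grouping on the exact set of additional words is the simplest possible choice; a less brittle grouping would need some measure of similarity between additional phrases.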
The bottom portion of
In this simple example, the set-specific summary information for each set includes: a representative phrase for the set; and the frequency with which any expanded phrase in that set was found in the collection of language-based information sources. In other examples, set-specific summary information may also include: the number of different information sources that include one or more of the expanded phrases in this set; the number of times that prior users have explored detail on this set in previous searches; a measure of the degree of variation in phrase wording among expanded phrases within the set; or other summary indicators.
As an extension of the basic three-step method disclosed herein, each of the set-specific summary results, 211-214, may include links that the user can click to obtain additional information on the individual expanded phrases comprising that set, the individual information sources with expanded phrases from that set, or both.
At first glance, it might appear that this is a two-step search process for a user to get to the individual sources and, as such, that it would be less efficient than the “one-step” process used by current search engines. One might argue that with current search engines, all one has to do is to enter a search phrase and one immediately gets a listing of individual sources.
However, closer analysis considering the role of context shows that this context-driven search method is actually more efficient than current search engines. For reasons that we discussed above, current search engines that do not have an interactive search box have serious problems with respect to providing context-relevant results and capturing all relevant sources with word variation. In contrast, this context-driven search method directly addresses and corrects these problems. Further, even current search engines that do have an interactive search box with auto-completion options ultimately require the user to engage in a three-step process. First, the user must interact with the search box and auto-completion options to enter an input phrase. Second, the user submits the phrase and the algorithm produces a list of search results. Third, the user must wade through the search results (which are not organized by context) to pick out only those results with a relevant context for the search phrase. In contrast, this context-driven search method provides context-relevant and variation-tolerant results in only two steps. In both cases, the context-driven search method is more efficient for the user.
In this example, the process of identifying and grouping expanded phrases that was shown in
The transition of this example from
Number | Name | Date | Kind |
---|---|---|---|
5265065 | Turtle | Nov 1993 | A |
5418948 | Turtle | May 1995 | A |
5675819 | Schuetze | Oct 1997 | A |
5933822 | Braden-Harder et al. | Aug 1999 | A |
5963940 | Liddy et al. | Oct 1999 | A |
6026388 | Liddy et al. | Feb 2000 | A |
6032145 | Beall et al. | Feb 2000 | A |
6178420 | Sassano | Jan 2001 | B1 |
6185576 | McIntosh | Feb 2001 | B1 |
6321224 | Beall et al. | Nov 2001 | B1 |
6327590 | Chidlovskii et al. | Dec 2001 | B1 |
6510406 | Marchisio | Jan 2003 | B1 |
6519585 | Kohli | Feb 2003 | B1 |
6542889 | Aggarwal et al. | Apr 2003 | B1 |
6651058 | Sundaresan et al. | Nov 2003 | B1 |
6721728 | McGreevy | Apr 2004 | B2 |
6751611 | Krupin et al. | Jun 2004 | B2 |
6766320 | Wang et al. | Jul 2004 | B1 |
6901399 | Corston et al. | May 2005 | B1 |
7051023 | Kapur et al. | May 2006 | B2 |
7113943 | Bradford et al. | Sep 2006 | B2 |
7231343 | Treadgold et al. | Jun 2007 | B1 |
7231379 | Parikh et al. | Jun 2007 | B2 |
7260567 | Parikh et al. | Aug 2007 | B2 |
7287025 | Wen et al. | Oct 2007 | B2 |
7346490 | Fass et al. | Mar 2008 | B2 |
7366711 | McKeown et al. | Apr 2008 | B1 |
7370056 | Parikh et al. | May 2008 | B2 |
7383258 | Harik et al. | Jun 2008 | B2 |
7392174 | Freeman | Jun 2008 | B2 |
7398201 | Marchisio et al. | Jul 2008 | B2 |
7428529 | Zeng et al. | Sep 2008 | B2 |
7440941 | Borkovsky et al. | Oct 2008 | B1 |
7444325 | Khandelwal et al. | Oct 2008 | B2 |
7475063 | Datta et al. | Jan 2009 | B2 |
7487094 | Konig et al. | Feb 2009 | B1 |
7496561 | Caudill et al. | Feb 2009 | B2 |
7499934 | Zhang et al. | Mar 2009 | B2 |
7526425 | Marchisio et al. | Apr 2009 | B2 |
7536408 | Patterson | May 2009 | B2 |
7562069 | Chowdhury et al. | Jul 2009 | B1 |
7580921 | Patterson | Aug 2009 | B2 |
7580929 | Patterson | Aug 2009 | B2 |
7584175 | Patterson | Sep 2009 | B2 |
7599914 | Patterson | Oct 2009 | B2 |
7599930 | Burns et al. | Oct 2009 | B1 |
7624007 | Bennett | Nov 2009 | B2 |
7627548 | Riley et al. | Dec 2009 | B2 |
7630978 | Li et al. | Dec 2009 | B2 |
7630980 | Parikh | Dec 2009 | B2 |
7634462 | Weyand et al. | Dec 2009 | B2 |
7636714 | Lamping et al. | Dec 2009 | B1 |
20060206474 | Kapur et al. | Sep 2006 | A1 |
20070043761 | Chim et al. | Feb 2007 | A1 |
20070100823 | Inmon | May 2007 | A1 |
20070106937 | Cucerzan et al. | May 2007 | A1 |
20080091670 | Ismalon | Apr 2008 | A1 |
20080114721 | Jones et al. | May 2008 | A1 |
20080215564 | Bratseth | Sep 2008 | A1 |
20080319962 | Riezler et al. | Dec 2008 | A1 |
20090024606 | Schilit et al. | Jan 2009 | A1 |
20090055394 | Schilit et al. | Feb 2009 | A1 |
20090083028 | Davtchev et al. | Mar 2009 | A1 |
20090193008 | De et al. | Jul 2009 | A1 |
20090216737 | Dexter | Aug 2009 | A1 |
20090240685 | Costello et al. | Sep 2009 | A1 |
20090259647 | Curtis | Oct 2009 | A1 |
20090276426 | Liachenko et al. | Nov 2009 | A1 |
20090282033 | Alshawi | Nov 2009 | A1 |
20090313233 | Hanazawa | Dec 2009 | A1 |
20090327269 | Paparizos et al. | Dec 2009 | A1 |

Number | Date | Country |
---|---|---|
20110276599 A1 | Nov 2011 | US |