The present invention relates to the field of search engines. More specifically, the present invention relates to topic-based language models for application search engines.
Many “app stores,” such as the Apple® App Store and the Amazon® Appstore, currently exist and provide mobile device applications for download to users' mobile devices. Additionally, there are search products which enable users to search for mobile device applications on these app stores. However, some app stores and search products have significant shortcomings, such as being limited to searching by application name or application developer.
Topic-based language models for an application search engine enable a user to search for an application based on the application's function rather than its title. To enable a search based on function, information, including application names, descriptions and external information, is gathered and processed. Processing the information includes filtering the information, generating a topic model and supplementing the topic model with additional information. The resultant topic-based language models are able to be used in an application search engine.
An implicit search engine enables searching for mobile applications (also referred to as “apps”) based on their function rather than their name. This differs substantially from web searches in that the contents of the app being searched for are typically not accessible to the search engine. Further, the contents of the app do not necessarily correlate with users' query behavior. When querying for apps, users formulate queries that identify the function of the app. However, apps are typically made up overwhelmingly of content that consists of instances of the function and does not describe (or even refer to) the function itself. For example, a messaging app may contain many message logs that do not refer to the app's function, but users wish to search for terms related to messaging rather than for the contents of the message logs.
The improved process of generating topic model-based word probabilities is as follows. First, a corpus of metadata capturing app functionality is assembled from various data sources for each app. The content of the corpus is normalized to a canonical form, and then the topic model is trained from this corpus. The resulting topic model is then used to learn a language model (e.g., a probability distribution over words) that represents each app's name and function.
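By way of illustration only, the normalization step and the derivation of a per-app language model from a topic model can be sketched as follows. The sketch assumes that the trained topic model supplies, for each topic, a distribution over words and, for each app, a mixture over topics, and that the language model is obtained by the mixture P(w | app) = Σ_t P(t | app) · P(w | t). The function names, the tokenization rule and the hand-written probabilities are hypothetical and are not taken from the described embodiments; training of the topic model itself is not shown.

```python
import re
from collections import defaultdict


def normalize(text):
    """Map raw metadata text to an assumed canonical form:
    lowercase tokens made of letters and digits only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def app_language_model(topic_mixture, topic_word_probs):
    """Compute P(w | app) = sum over topics t of P(t | app) * P(w | t)."""
    word_probs = defaultdict(float)
    for topic, p_topic in topic_mixture.items():
        for word, p_word in topic_word_probs[topic].items():
            word_probs[word] += p_topic * p_word
    return dict(word_probs)


# Hypothetical topic model output (illustrative values only).
topic_word_probs = {
    "messaging": {"chat": 0.5, "message": 0.3, "text": 0.2},
    "photo": {"camera": 0.6, "filter": 0.4},
}
# Hypothetical topic mixture for one app.
topic_mixture = {"messaging": 0.75, "photo": 0.25}

tokens = normalize("Send Messages, FAST!")
lm = app_language_model(topic_mixture, topic_word_probs)
```

Because each topic's word distribution and the app's topic mixture each sum to one, the resulting per-app word probabilities also sum to one.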
In order to ensure relevancy and coverage, post-processing is then carried out in two steps. The first step is to eliminate words not pertinent to the app, and the second step is to associate with the app words deemed relevant based on query logs.
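One possible form of the two post-processing steps is sketched below. The manner in which pertinence is determined is not specified here, so the sketch makes two labeled assumptions: a word is treated as not pertinent when its probability falls below a hypothetical threshold `min_prob`, and query-log words are blended in with a hypothetical interpolation weight `query_weight`. The function names and sample values are likewise illustrative only.

```python
def _normalized(weights):
    """Scale non-negative weights so they sum to one (empty if all zero)."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()} if total > 0 else {}


def postprocess(word_probs, query_log_counts, min_prob=0.05, query_weight=0.2):
    """Step 1: drop low-probability words (assumed proxy for 'not pertinent').
    Step 2: blend in words observed in query logs for this app."""
    kept = _normalized({w: p for w, p in word_probs.items() if p >= min_prob})
    from_queries = _normalized(dict(query_log_counts))
    if not from_queries:
        return kept
    if not kept:
        return from_queries
    # Linear interpolation keeps the result a probability distribution.
    result = {w: (1.0 - query_weight) * p for w, p in kept.items()}
    for word, p in from_queries.items():
        result[word] = result.get(word, 0.0) + query_weight * p
    return result


# Hypothetical inputs: a per-app language model and query-log word counts.
word_probs = {"chat": 0.5, "message": 0.3, "text": 0.18, "the": 0.02}
query_log_counts = {"sms": 3, "chat": 1}

final_model = postprocess(word_probs, query_log_counts)
```

In this example the word "the" is removed by the first step, and "sms", which never appeared in the app's own metadata, gains probability mass from the query logs in the second step.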
In some embodiments, the topic-based language model generation application(s) 330 include several applications and/or modules. In some embodiments, modules include one or more sub-modules as well.
Examples of suitable computing devices include a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart phone (e.g., an iPhone® or a Droid®), a tablet (e.g., an iPad®), a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®, a video player, a DVD writer/player, a Blu-ray® writer/player, a television, a home entertainment system, Apple TV® and associated devices, a cloud-coupled device or any other suitable computing device.
To utilize the topic-based language model generation for search engines, a device automatically, manually or semi-automatically generates topic-based language models by searching/crawling for app information from several sources, filtering the information down to relevant, functional information, building a topic model, filtering the topic model and calculating word probabilities for each app. In post-processing steps, additional words are able to be added to the topic model based on supplemental information. The resultant topic-based language model is able to be used to enable searching for apps by function.
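As a non-limiting illustration of how per-app word probabilities could support searching by function, the following sketch ranks apps by the log-likelihood of the query words under each app's language model. The scoring rule is an assumption, since no particular ranking function is specified; the app names, probabilities and the `floor` value used for unseen words are hypothetical.

```python
import math


def score(query_tokens, word_probs, floor=1e-6):
    """Log-likelihood of the query under one app's language model.
    Words absent from the model receive a small assumed floor probability."""
    return sum(math.log(word_probs.get(token, floor)) for token in query_tokens)


def search(query, app_models):
    """Return app names ordered from best to worst match for the query."""
    tokens = query.lower().split()
    return sorted(
        app_models,
        key=lambda app: score(tokens, app_models[app]),
        reverse=True,
    )


# Hypothetical per-app language models (illustrative values only).
app_models = {
    "ChatterBox": {"chat": 0.5, "message": 0.3, "text": 0.2},
    "SnapLens": {"camera": 0.6, "filter": 0.4},
}

messaging_results = search("send text message", app_models)
photo_results = search("camera filter", app_models)
```

Neither hypothetical app name contains any query word, yet each query ranks the app whose function matches it first, which is the behavior that a function-based search is intended to provide.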
In operation, the topic-based language model generation produces a set of information that permits searching for apps based on a function of an app rather than a title of an app, which improves a user's experience of searching for mobile device apps.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that various other modifications may be made in the embodiments chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.
This application is a continuation-in-part of co-pending U.S. patent application Ser. No. 13/312,126, filed Dec. 6, 2011, and titled, “SEARCH, RECOMMENDATION AND DISCOVERY INTERFACE,” which claims priority under 35 U.S.C. §119(e) of the U.S. Provisional Patent Application Ser. No. 61/421,560, filed Dec. 9, 2010, and titled, “APP SEARCH ENGINE,” both of which are also hereby incorporated by reference in their entireties for all purposes. This application also claims priority under 35 U.S.C. §119(e) of the U.S. Provisional Patent Application Ser. No. 61/473,672, filed Apr. 8, 2011 and titled, “IMPROVED GENERATION OF TOPIC-BASED LANGUAGE MODELS FOR AN APP SEARCH ENGINE,” which is also hereby incorporated by reference in its entirety for all purposes.
Number | Date | Country |
---|---|---|
20120191694 A1 | Jul 2012 | US |

Number | Date | Country |
---|---|---|
61421560 | Dec 2010 | US |
61473672 | Apr 2011 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13312126 | Dec 2011 | US |
Child | 13440896 | | US |