The present disclosure relates to methods, techniques, and systems for presenting content using natural language processing and, in particular, to methods, techniques, and systems for recognizing named entities using natural language processing and presenting content related thereto.
With more than 15 billion documents on the World Wide Web (the Web) today, it has become very difficult for users to find desired information or to discover relevant information. Typically, a user engages a keyword (Boolean) based search engine to enter terms that s/he thinks relate to the topic of interest. Unfortunately, there could be hundreds of thousands of documents with similar keywords, requiring readers to sort out what is relevant. Moreover, once a user has followed links (e.g., hyperlinks, hypertext, indicators, etc.) to more than a few web pages, it is highly likely that the user has navigated to a point that makes it difficult to retrace steps.
Thus, although the volume of documents on the Web potentially makes a lot more information available to the average person, it takes a fair bit of time to actually find documents that are useful.
Embodiments described herein provide enhanced computer- and network-assisted methods, techniques, and systems for using natural language processing techniques, potentially in conjunction with context or other related information, to locate and provide content related to entities that are recognized in associated material. Example embodiments provide one or more NLP-based content recommenders (“NCRs”) that each, based upon a natural language analysis of an underlying text segment, determine which entities are being referred to in the text segment and recommend additional content relating to such entities.
NCRs may be useful in environments such as to support a user browsing pages of content on the Web. One or more NCRs may be embedded as widgets on such pages to assist users in their perusal and search for information, provided by means of browser plug-ins or other application plug-ins, provided in libraries or in standalone environments, or otherwise integrated into other code, programs, or devices.
For example, when a news article is being displayed in a Web browser, an NCR may be invoked to suggest additional relevant content by recognizing the entities referred to in the article and determining relevant additional content, organized by a number of factors, for example, by frequency of appearance of other information relating to one of the recognized entities in the article, by knowledge of the browse patterns of the reader, etc. An NCR might also be invoked to allow the reader to explore the top entities “connected” to one of the entities selected from the entities recognized in the news article. Connectedness in this sense refers to entities which are related to the selected recognized entity typically through one or more actions (verbs). Or an NCR might be invoked to “filter” or otherwise rank or order the content presented to the user.
In at least some embodiments, the NCR may use context information relating to source information that was used to establish and identify the entities (e.g., verbs, related entities, entities within close proximity in the underlying text or in other text, or other clues) in the recommendations. In some embodiments, algorithms are employed for natural language-based entity recognition and disambiguation to determine which entities are present in the underlying text. For example, these algorithms may be incorporated to display an ordered list of all, or the most important, or the top “n” entities present on a Web page in conjunction with the underlying page. The items on the list can then be used to navigate to additional (related) content, for example, as “links” or other references to the content. The example NCR illustrated in
An example system that supports the generation of an ordered list of entities is described in co-pending U.S. patent application filed concurrently herewith, assigned application Ser. No. 12/288,158, and titled “NLP-Based Entity Recognition and Disambiguation,” which is incorporated by reference in its entirety.
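By way of illustration only, the ordering of recognized entities into a top “n” list might be sketched as follows. The sketch assumes that entity mentions have already been recognized and disambiguated by an upstream component, and it ranks purely by mention frequency, which is only one of the possible ordering factors. The function name `top_entities` and the sample mentions are hypothetical and do not come from the referenced application.

```python
from collections import Counter


def top_entities(mentions, n=3):
    """Return the n most frequently mentioned entities, most frequent first."""
    counts = Counter(mentions)
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical mentions, as if produced by an upstream entity recognizer.
mentions = [
    "Barack Obama", "Ohio", "Barack Obama",
    "Jennifer Brunner", "Ohio", "Barack Obama",
]

for rank, name in enumerate(top_entities(mentions, n=2), start=1):
    print(rank, name)
```

Each item in the resulting list could then be rendered as a link or other reference to related content.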
In addition, in at least some embodiments, an NLP-Based search mechanism can be incorporated by an NCR to find related (e.g., auxiliary or supplemental) information to recommend. Contextual and other information, such as information from ontology knowledge base lookups or from other knowledge repositories, may also be incorporated in establishing information to recommend. One such system and methods for generating related content using relationship searching is encompassed in the InFact® relationship search technology (now the Evri relationship search technology), described in more detail in U.S. patent application Ser. No. 11/012,089, filed Dec. 13, 2004, which published on Dec. 1, 2005 as U.S. Patent Publication No. 2005/0267871A1, and which is hereby incorporated by reference in its entirety. In this system, NLP-based processing is used to locate entities and the connections (relationships) between them based upon actions that link a source entity to a target entity, or vice versa (i.e., queries that specify a subject and/or an object, and zero or more verbs that may relate them).
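As a hedged illustration of the query shape just described (a subject and/or an object, and zero or more verbs), the following sketch filters a small in-memory list of subject-verb-object triples. It is not the InFact®/Evri implementation and does not use IQL/EQL syntax; the `Relationship` type, the `find_relationships` function, and the sample triples are all hypothetical stand-ins for an NLP-derived relationship index.

```python
from typing import Iterable, List, NamedTuple, Optional


class Relationship(NamedTuple):
    subject: str
    verb: str
    obj: str


def find_relationships(
    index: Iterable[Relationship],
    subject: Optional[str] = None,
    obj: Optional[str] = None,
    verbs: Iterable[str] = (),
) -> List[Relationship]:
    """Return relationships matching the given subject and/or object.

    An empty `verbs` argument means that any action is accepted.
    """
    wanted = {v.lower() for v in verbs}
    results = []
    for rel in index:
        if subject is not None and rel.subject != subject:
            continue
        if obj is not None and rel.obj != obj:
            continue
        if wanted and rel.verb.lower() not in wanted:
            continue
        results.append(rel)
    return results


# Hypothetical triples, for illustration only.
index = [
    Relationship("Barack Obama", "visit", "Ohio"),
    Relationship("Barack Obama", "address", "voters"),
    Relationship("Jennifer Brunner", "visit", "Ohio"),
]

print(find_relationships(index, subject="Barack Obama"))
print(find_relationships(index, obj="Ohio", verbs=["visit"]))
```

Leaving the subject or the object unspecified corresponds to asking which entities are connected to the remaining one through the listed actions.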
In addition, the InFact®/Evri technology provides a query language called “IQL” (now “EQL”) and a navigation tip system with query templates for generating relationship queries with or without a graphical user interface. Query templates and the navigation tip system may be incorporated by other code to automatically generate generalized searches of content that utilize sophisticated linguistics and/or knowledge-based analysis. The InFact®/Evri tip system not only performs the NLP-based search, but can order the results as desired. In addition, the tip system can dynamically evolve the searches—hence the related entities—as the underlying text is changed, for example by filtering it using focus terms 102 in
In at least some embodiments, NCRs are provided by means of a user interface control displayed adjacent to, proximate to, on or near other displayed content such as illustrated in
Such widgets need not be limited to displaying related content accessible via a Web browser. Indeed, NCR widgets also may be useful in a variety of other contexts and platforms, such as to create other mechanisms for finding sought after data in large repositories of information (e.g., corporate intelligence data bases, product information, etc.), to perform research or other discovery, to provide learning tools in educational environments, to navigate newsletters and archived articles for a company, etc. NCRs are intended to aid in conveying meaningful information to end users from among a morass of data without them necessarily knowing how to search for that information. They are intended to do a better job at emulating “understanding” the underlying text than a keyword search engine would, so that users can search less and understand more, or discover more with less work.
NCR widgets present user interfaces that may vary depending upon the context in which they are integrated, their use, etc.
In at least some of the NCRs, the name of an entity (e.g., Barack Obama) is provided along with an indication of the type of entity and/or its roles (e.g., categories or facets, such as senator, democrat, presidential candidate). Then, for some NCRs, a list of facts about the entity and/or an overview of further content is displayed. In at least some embodiments, an image associated with the named entity is also displayed. Importantly, if more information (as determined by the NCR) is available, then a link (also referred to as a hyperlink, hypertext, or other indicator) may be displayed. The link may be operated (e.g., selected or navigated to) by a user to navigate to recommended content. Other features, including more or different features, may be provided or combined in an embodiment of an NCR as helpful in the context.
For example, as described earlier, the example NCR 101 in
The illustrated NCR 110 also includes a “Connections” section 106, which provides a graphical map of the entities related to the selected named entity 105. The entities included in the graphical map 106 may be selected by the NCR 110 as the most popular entities, the most frequently described in the top related articles, or using other rules. In one embodiment, as shown, the entities in the connections map 106 are color-coded based upon their base type: for example, whether they are persons, places, or things (which may include organizations, products, etc.). An end user may select one of the nodes 107 on the map 106, to further change the recommendations by refining what is considered “related.”
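For illustration, the selection and color-coding of nodes for such a connections map might be sketched as below. The sketch assumes each related entity arrives with a base type and a mention count, selects the most frequently mentioned entities, and assigns a color per base type. The specific colors, the `build_connection_nodes` name, and the sample data are assumptions made for this example only.

```python
# Assumed color per base type; the actual palette is not specified.
BASE_TYPE_COLORS = {"person": "blue", "place": "green", "thing": "orange"}


def build_connection_nodes(related, limit=5):
    """Build display nodes from (name, base_type, mention_count) tuples.

    The most frequently mentioned entities are kept, and each node is
    colored according to its base type (gray if the type is unknown).
    """
    ranked = sorted(related, key=lambda r: (-r[2], r[0]))
    return [
        {"label": name, "color": BASE_TYPE_COLORS.get(base_type, "gray")}
        for name, base_type, _count in ranked[:limit]
    ]


# Hypothetical related entities for a selected named entity.
related = [
    ("Ohio", "place", 12),
    ("Jennifer Brunner", "person", 7),
    ("Democratic Party", "thing", 9),
]

for node in build_connection_nodes(related, limit=2):
    print(node["label"], node["color"])
```

Selecting one of the resulting nodes could then be used to refine what is considered “related.”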
Example NCRs also may include still and/or video images. By selecting link 123, the user can navigate to recommended videos that relate to the relationship between “Barack Obama” and “Ohio.” Note that these recommendations may also be ordered and/or ranked.
Note that
The powerful NLP based search processing identifies the topmost entities in the relationship displayed by the articles recommended in section 322. That is, these are the entities involved in a “governing” relationship with “Jennifer Brunner.”
As described above, the layout of an NCR tip or user interface control (UI control) may depend upon the information available. Generally, in the example illustrated in
For example, as shown in larger images in
According to one example embodiment, to populate the fields of the tip or UI control, such as action list 612 and connections list (relationships list) 613, an IQL/EQL query may be performed against the last “W” weeks of news content to return related information. In the illustrated case, “N” results are returned for actions performed by the entity, in this case United States of America, sorted by action (verb) frequency. The top “V” verbs are then displayed, as seen in action list 612. In other embodiments, actions could be derived from an NLP-based relationship extraction of the context (trigger) text or a set of documents related to the context text, or from other sources.
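The population of an action list from the last “W” weeks of content can be sketched as follows. This is a minimal sketch under stated assumptions rather than the IQL/EQL query itself: it assumes the relationship results are already available as (publication date, subject entity, verb) records, and the function name `top_actions` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def top_actions(records, entity, today, weeks, top_v):
    """Return the top_v most frequent verbs performed by `entity`.

    Only records published within the last `weeks` weeks before `today`
    are counted; verbs are ordered by descending frequency.
    """
    cutoff = today - timedelta(weeks=weeks)
    counts = Counter(
        verb
        for published, subject, verb in records
        if subject == entity and published >= cutoff
    )
    # Break frequency ties alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [verb for verb, _ in ranked[:top_v]]


# Hypothetical records: (publication date, subject entity, verb).
records = [
    (date(2008, 10, 10), "United States of America", "sign"),
    (date(2008, 10, 12), "United States of America", "sign"),
    (date(2008, 10, 1), "United States of America", "announce"),
    (date(2008, 6, 1), "United States of America", "announce"),
    (date(2008, 6, 2), "United States of America", "announce"),
    (date(2008, 10, 5), "Ohio", "host"),
]

print(top_actions(records, "United States of America",
                  today=date(2008, 10, 16), weeks=4, top_v=2))
```

Widening the window changes the ordering in this sample, because older records for a verb begin to count toward its frequency.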
The progression from
When the user hovers over or otherwise selects the named entity “Kaela Kennelly” 801, a tip 850 is displayed with initial information similar to that described with reference to
As shown in larger image in
Other representations for presenting recommended content by means of NLP-Based Content Recommenders are also contemplated. It is notable that many such representations hide the power of the underlying relationship indexing and searching technology by giving the user simple navigation tools and hints for getting more information. Moreover, the information is determined, calculated, and presented in substantially real-time or near real-time, and may be dynamically updated periodically, or at specified intervals, or according to different schedules.
An NCR widget may be implemented using standard programming techniques that leverage the capabilities of an NLP-based processing engine that can perform indexing and relationship searching. It is to be understood that, although the interfaces illustrated in
Also, although certain terms are used primarily herein, other terms could be used interchangeably to yield equivalent embodiments and examples. In addition, terms may have alternate spellings which may or may not be explicitly mentioned, and all such variations of terms are intended to be included. In addition, in the following description, numerous specific details are set forth, such as data formats and code sequences, etc., in order to provide a thorough understanding of the described techniques. The embodiments described also can be practiced without some of the specific details described herein, or with other specific details, such as changes with respect to the ordering of the code flow, different code flows, etc. Thus, the scope of the techniques and/or functions described are not limited by the particular order, selection, or decomposition of steps described with reference to any particular routine.
Computing system 1300 may comprise one or more server and/or client computing systems and may span distributed locations. In addition, each block shown may represent one or more such blocks as appropriate to a specific embodiment or may be combined with other blocks. Moreover, the various blocks of the NCR 1310 may physically reside on one or more machines, which use standard (e.g., TCP/IP) or proprietary interprocess communication mechanisms to communicate with each other.
In the embodiment shown, computer system 1300 comprises a computer memory (“memory”) 1301, a display 1302, one or more Central Processing Units (“CPU”) 1303, Input/Output devices 1304 (e.g., keyboard, mouse, CRT or LCD display, etc.), other computer-readable media 1305, and network connections 1306. The NCR 1310 is shown residing in memory 1301. In other embodiments, some portion of the contents, some of, or all of the components of the NCR 1310 may be stored on and/or transmitted over the other computer-readable media 1305. The components of the NCR 1310 preferably execute on one or more CPUs 1303 and perform entity identification and present content recommendations, as described herein. Other code or programs 1330 and potentially other data repositories, such as data repository 1320, also reside in the memory 1301, and preferably execute on one or more CPUs 1303. Of note, one or more of the components in
In one embodiment, the NCR 1310 includes an entity identification engine 1311, a knowledge analysis engine 1312, an NCR user interface support module 1313, an NLP parsing engine or preprocessor 1314, an NCR API 1317, a data repository (or interface thereto) for storing document NLP data 1316, and a knowledge data repository 1315, for example, an ontology index, for storing information from a multitude of internal and/or external sources. In at least some embodiments, one or more of the NLP parsing engine/preprocessor 1314, the entity identification engine 1311, and the knowledge analysis engine 1312 are provided external to the NCR and are available, potentially, over one or more networks 1380. Other and/or different modules may be implemented. In addition, the NCR 1310 may interact via a network 1380 with applications or client code 1355 that uses results computed by the NCR 1310, one or more client computing systems 1360, and/or one or more third-party information provider systems 1365, such as purveyors of information used in knowledge data repository 1315. Also, of note, the knowledge data 1315 and the document data 1316 may be provided external to the NCR as well, for example, and be accessible over one or more networks 1380 to the NCR.
In an example embodiment, components/modules of the NCR 1310 are implemented using standard programming techniques. However, a range of programming languages known in the art may be employed for implementing such example embodiments, including representative implementations of various programming language paradigms, including but not limited to, object-oriented (e.g., Java, C++, C#, Smalltalk), functional (e.g., ML, Lisp, Scheme, etc.), procedural (e.g., C, Pascal, Ada, Modula, etc.), scripting (e.g., Perl, Ruby, Python, JavaScript, VBScript, etc.), declarative (e.g., SQL, Prolog, etc.), etc.
The embodiments described use well-known or proprietary synchronous or asynchronous client-server computing techniques. However, the various components may be implemented using more monolithic programming techniques as well, for example, as an executable running on a single CPU computer system, or alternately decomposed using a variety of structuring techniques known in the art, including but not limited to, multiprogramming, multithreading, client-server, or peer-to-peer, running on one or more computer systems each having one or more CPUs. Some embodiments are illustrated as executing concurrently and asynchronously and communicating using message passing techniques. Equivalent synchronous embodiments are also supported by an NCR implementation.
In addition, programming interfaces to the data stored as part of the NCR 1310 (e.g., in the data repositories 1315 and 1316) can be made available by standard means such as through C, C++, C#, and Java APIs; libraries for accessing files, databases, or other data repositories; through scripting languages such as XML; or through Web servers, FTP servers, or other types of servers providing access to stored data. The data repositories 1315 and 1316 may be implemented as one or more database systems, file systems, or any other method known in the art for storing such information, or any combination of the above, including implementation using distributed computing techniques.
Also, the example NCR 1310 may be implemented in a distributed environment comprising multiple, even heterogeneous, computer systems and networks. For example, in one embodiment, the modules 1311-1314, and 1317, and the data repositories 1315 and 1316 are all located in physically different computer systems. In another embodiment, various modules of the NCR 1310 are hosted each on a separate server machine and may be remotely located from the tables which are stored in the data repositories 1315 and 1316. Also, one or more of the modules may themselves be distributed, pooled or otherwise grouped, such as for load balancing, reliability or security reasons. Different configurations and locations of programs and data are contemplated for use with the techniques described herein. A variety of distributed computing techniques are appropriate for implementing the components of the illustrated embodiments in a distributed manner including but not limited to TCP/IP sockets, RPC, RMI, HTTP, Web Services (XML-RPC, JAX-RPC, SOAP, etc.). Other variations are possible. Also, other functionality could be provided by each component/module, or existing functionality could be distributed amongst the components/modules in different ways, yet still achieve the functions of an NCR.
Furthermore, in some embodiments, some or all of the components of the NCR may be implemented or provided in other manners, such as at least partially in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs), standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers), field-programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), etc. Some or all of the system components and/or data structures may also be stored as contents (e.g., as executable or other machine-readable software instructions or structured data) on a computer-readable medium (e.g., as a hard disk; a memory; a computer network or cellular wireless network or other data transmission medium; or a portable media article to be read by an appropriate drive or via an appropriate connection, such as a DVD or flash memory device) so as to enable or configure the computer-readable medium and/or one or more associated computing systems or devices to execute or otherwise use or provide the contents to perform at least some of the described techniques. Some or all of the system components and data structures may also be transmitted as contents of generated data signals (e.g., by being encoded as part of a carrier wave or otherwise included as part of an analog or digital propagated signal) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames). Such computer program products may also take other forms in other embodiments. Accordingly, embodiments of the present disclosure may be practiced with other computer system configurations.
All of the above U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, including but not limited to U.S. Provisional Patent Application No. 60/999,559, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 17, 2007, and U.S. application Ser. No. 12/288,347, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 16, 2008, are incorporated herein by reference, in their entireties.
From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of this disclosure. For example, the methods, techniques, and systems for entity recognition and disambiguation are applicable to architectures other than a Web-based architecture. For example, other systems that are programmed to perform natural language processing can be employed. Also, the methods, techniques, and systems discussed herein are applicable to differing query languages, protocols, communication media (optical, wireless, cable, etc.) and devices (such as wireless handsets, electronic organizers, personal digital assistants, portable email machines, game machines, pagers, navigation devices such as GPS receivers, etc.).
This application is a continuation of U.S. application Ser. No. 12/288,349, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 16, 2008, now U.S. Pat. No. 8,700,604, which claims the benefit of U.S. Provisional Application No. 60/999,559, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 17, 2007, all of which are incorporated herein in their entireties.
Number | Name | Date | Kind |
---|---|---|---|
4839853 | Deerwester et al. | Jun 1989 | A |
5301109 | Landauer et al. | Apr 1994 | A |
5317507 | Gallant | May 1994 | A |
5325298 | Gallant | Jun 1994 | A |
5331556 | Black, Jr. et al. | Jul 1994 | A |
5377103 | Lamberti et al. | Dec 1994 | A |
5619709 | Caid et al. | Apr 1997 | A |
5634051 | Thomson | May 1997 | A |
5752022 | Chiu | May 1998 | A |
5778362 | Deerwester | Jul 1998 | A |
5794050 | Dahlgren et al. | Aug 1998 | A |
5794178 | Caid et al. | Aug 1998 | A |
5799268 | Boguraev | Aug 1998 | A |
5848417 | Shoji et al. | Dec 1998 | A |
5857179 | Vaithyanathan et al. | Jan 1999 | A |
5884302 | Ho | Mar 1999 | A |
5933822 | Braden-Harder et al. | Aug 1999 | A |
5950189 | Cohen et al. | Sep 1999 | A |
5982370 | Kamper | Nov 1999 | A |
6006221 | Liddy et al. | Dec 1999 | A |
6006225 | Bowman et al. | Dec 1999 | A |
6026388 | Liddy et al. | Feb 2000 | A |
6061675 | Wical | May 2000 | A |
6064951 | Park et al. | May 2000 | A |
6122647 | Horowitz et al. | Sep 2000 | A |
6167368 | Wacholder | Dec 2000 | A |
6178416 | Thompson et al. | Jan 2001 | B1 |
6185550 | Snow et al. | Feb 2001 | B1 |
6192360 | Dumais et al. | Feb 2001 | B1 |
6202064 | Julliard | Mar 2001 | B1 |
6219664 | Watanabe | Apr 2001 | B1 |
6246977 | Messerly et al. | Jun 2001 | B1 |
6311152 | Bai et al. | Oct 2001 | B1 |
6363373 | Steinkraus | Mar 2002 | B1 |
6405190 | Conklin | Jun 2002 | B1 |
6411962 | Kupiec | Jun 2002 | B1 |
6460029 | Fries et al. | Oct 2002 | B1 |
6484162 | Edlund et al. | Nov 2002 | B1 |
6510406 | Marchisio | Jan 2003 | B1 |
6571236 | Ruppelt | May 2003 | B1 |
6584464 | Warthen | Jun 2003 | B1 |
6601026 | Appelt et al. | Jul 2003 | B2 |
6631523 | Matthews, III | Oct 2003 | B1 |
6728707 | Wakefield et al. | Apr 2004 | B1 |
6732097 | Wakefield et al. | May 2004 | B1 |
6732098 | Wakefield et al. | May 2004 | B1 |
6738765 | Wakefield et al. | May 2004 | B1 |
6741988 | Wakefield et al. | May 2004 | B1 |
6745161 | Arnold et al. | Jun 2004 | B1 |
6757646 | Marchisio | Jun 2004 | B2 |
6859800 | Roche et al. | Feb 2005 | B1 |
6862710 | Marchisio | Mar 2005 | B1 |
6904433 | Kapitskaia et al. | Jun 2005 | B2 |
6910003 | Arnold et al. | Jun 2005 | B1 |
6996575 | Cox et al. | Feb 2006 | B2 |
7051017 | Marchisio | May 2006 | B2 |
7054854 | Hattori et al. | May 2006 | B1 |
7146416 | Yoo et al. | Dec 2006 | B1 |
7171349 | Wakefield et al. | Jan 2007 | B1 |
7283951 | Marchisio et al. | Oct 2007 | B2 |
7356778 | Hooper et al. | Apr 2008 | B2 |
7398201 | Marchisio et al. | Jul 2008 | B2 |
7403938 | Harrison et al. | Jul 2008 | B2 |
7451135 | Goldman et al. | Nov 2008 | B2 |
7526425 | Marchisio et al. | Apr 2009 | B2 |
7529756 | Haschart et al. | May 2009 | B1 |
7672833 | Blume et al. | Mar 2010 | B2 |
7752200 | Scholl et al. | Jul 2010 | B2 |
7788084 | Brun et al. | Aug 2010 | B2 |
8069160 | Rao | Nov 2011 | B2 |
8112402 | Cucerzan et al. | Feb 2012 | B2 |
8122016 | Lamba et al. | Feb 2012 | B1 |
8122026 | Laroco, Jr. et al. | Feb 2012 | B1 |
8132103 | Chowdhury et al. | Mar 2012 | B1 |
8412557 | Lloyd et al. | Apr 2013 | B1 |
8666909 | Pinckney et al. | Mar 2014 | B2 |
8700604 | Roseman | Apr 2014 | B2 |
8725739 | Liang | May 2014 | B2 |
20020007267 | Batchilo et al. | Jan 2002 | A1 |
20020010574 | Tsourikov et al. | Jan 2002 | A1 |
20020022956 | Ukrainczyk et al. | Feb 2002 | A1 |
20020022988 | Columbus et al. | Feb 2002 | A1 |
20020059161 | Li | May 2002 | A1 |
20020078041 | Wu | Jun 2002 | A1 |
20020078045 | Dutta | Jun 2002 | A1 |
20020091671 | Prokoph | Jul 2002 | A1 |
20020103789 | Turnbull et al. | Aug 2002 | A1 |
20020120651 | Pustejovsky et al. | Aug 2002 | A1 |
20020156763 | Marchisio | Oct 2002 | A1 |
20030004716 | Haigh et al. | Jan 2003 | A1 |
20030101182 | Govrin et al. | May 2003 | A1 |
20030115065 | Kakivaya et al. | Jun 2003 | A1 |
20030115191 | Copperman et al. | Jun 2003 | A1 |
20030191626 | Al-Onaizan et al. | Oct 2003 | A1 |
20030233224 | Marchisio et al. | Dec 2003 | A1 |
20040010508 | Fest et al. | Jan 2004 | A1 |
20040044669 | Brown et al. | Mar 2004 | A1 |
20040064447 | Simske et al. | Apr 2004 | A1 |
20040103090 | Dogl et al. | May 2004 | A1 |
20040125877 | Chang et al. | Jul 2004 | A1 |
20040167870 | Wakefield et al. | Aug 2004 | A1 |
20040167883 | Wakefield et al. | Aug 2004 | A1 |
20040167884 | Wakefield et al. | Aug 2004 | A1 |
20040167885 | Wakefield et al. | Aug 2004 | A1 |
20040167886 | Wakefield et al. | Aug 2004 | A1 |
20040167887 | Wakefield et al. | Aug 2004 | A1 |
20040167907 | Wakefield et al. | Aug 2004 | A1 |
20040167908 | Wakefield et al. | Aug 2004 | A1 |
20040167909 | Wakefield et al. | Aug 2004 | A1 |
20040167910 | Wakefield et al. | Aug 2004 | A1 |
20040167911 | Wakefield et al. | Aug 2004 | A1 |
20040221235 | Marchisio et al. | Nov 2004 | A1 |
20040243388 | Corman et al. | Dec 2004 | A1 |
20050027704 | Hammond et al. | Feb 2005 | A1 |
20050076365 | Popov et al. | Apr 2005 | A1 |
20050108001 | Aarskog | May 2005 | A1 |
20050108262 | Fawcett, Jr. et al. | May 2005 | A1 |
20050138018 | Sakai et al. | Jun 2005 | A1 |
20050144064 | Calabria et al. | Jun 2005 | A1 |
20050149494 | Lindh et al. | Jul 2005 | A1 |
20050177805 | Lynch et al. | Aug 2005 | A1 |
20050197828 | McConnell et al. | Sep 2005 | A1 |
20050210000 | Michard | Sep 2005 | A1 |
20050216443 | Morton et al. | Sep 2005 | A1 |
20050234879 | Zeng et al. | Oct 2005 | A1 |
20050234968 | Arumainayagam et al. | Oct 2005 | A1 |
20050262050 | Fagin et al. | Nov 2005 | A1 |
20050267871 | Marchisio et al. | Dec 2005 | A1 |
20060149734 | Egnor et al. | Jul 2006 | A1 |
20060167862 | Reisman | Jul 2006 | A1 |
20060224565 | Ashutosh et al. | Oct 2006 | A1 |
20060229889 | Hodjat | Oct 2006 | A1 |
20060271353 | Berkan et al. | Nov 2006 | A1 |
20060279799 | Goldman | Dec 2006 | A1 |
20070067285 | Blume et al. | Mar 2007 | A1 |
20070130194 | Kaiser | Jun 2007 | A1 |
20070136326 | McClement et al. | Jun 2007 | A1 |
20070143300 | Gulli et al. | Jun 2007 | A1 |
20070156669 | Marchisio et al. | Jul 2007 | A1 |
20070174258 | Jones et al. | Jul 2007 | A1 |
20070203901 | Prado | Aug 2007 | A1 |
20070209013 | Ramsey | Sep 2007 | A1 |
20070233656 | Bunescu et al. | Oct 2007 | A1 |
20070276830 | Gruhl et al. | Nov 2007 | A1 |
20070276926 | LaJoie | Nov 2007 | A1 |
20080005651 | Grefenstette et al. | Jan 2008 | A1 |
20080010270 | Gross | Jan 2008 | A1 |
20080059456 | Chowdhury et al. | Mar 2008 | A1 |
20080082578 | Hogue et al. | Apr 2008 | A1 |
20080097975 | Guay et al. | Apr 2008 | A1 |
20080097985 | Olstad et al. | Apr 2008 | A1 |
20080120129 | Seubert et al. | May 2008 | A1 |
20080208864 | Cucerzan et al. | Aug 2008 | A1 |
20080228720 | Mukherjee et al. | Sep 2008 | A1 |
20080235203 | Case et al. | Sep 2008 | A1 |
20080249986 | Clarke-Martin et al. | Oct 2008 | A1 |
20080249991 | Valz | Oct 2008 | A1 |
20080256056 | Chang et al. | Oct 2008 | A1 |
20080288456 | Omoigui | Nov 2008 | A1 |
20080303689 | Iverson | Dec 2008 | A1 |
20080306899 | Gregory et al. | Dec 2008 | A1 |
20090070325 | Gabriel et al. | Mar 2009 | A1 |
20090076886 | Dulitz et al. | Mar 2009 | A1 |
20090144609 | Liang et al. | Jun 2009 | A1 |
20090187467 | Fang et al. | Jul 2009 | A1 |
20090228439 | Manolescu et al. | Sep 2009 | A1 |
20090248678 | Okamoto | Oct 2009 | A1 |
20090319342 | Shilman et al. | Dec 2009 | A1 |
20090327223 | Chakrabarti et al. | Dec 2009 | A1 |
20100010994 | Wittig et al. | Jan 2010 | A1 |
20100023311 | Subrahmanian et al. | Jan 2010 | A1 |
20100046842 | Conwell | Feb 2010 | A1 |
20100048242 | Rhoads et al. | Feb 2010 | A1 |
20100145940 | Chen et al. | Jun 2010 | A1 |
20100250497 | Redlich et al. | Sep 2010 | A1 |
20100299301 | Busch et al. | Nov 2010 | A1 |
20100299326 | Germaise | Nov 2010 | A1 |
20100306251 | Snell | Dec 2010 | A1 |
20110112995 | Chang et al. | May 2011 | A1 |
20110173194 | Sloo et al. | Jul 2011 | A1 |
20120254188 | Koperski et al. | Oct 2012 | A1 |
20130124510 | Guha | May 2013 | A1 |
Number | Date | Country |
---|---|---|
0 280 866 | Sep 1988 | EP |
0 597 630 | May 1994 | EP |
20080111822 | Dec 2008 | KR |
0014651 | Mar 2000 | WO |
0057302 | Sep 2000 | WO |
0122280 | Mar 2001 | WO |
0180177 | Oct 2001 | WO |
0227536 | Apr 2002 | WO |
0233583 | Apr 2002 | WO |
03017143 | Feb 2003 | WO |
2004053645 | Jun 2004 | WO |
2004114163 | Dec 2004 | WO |
2006068872 | Jun 2006 | WO |
Entry |
---|
Molla, Diego, “AFNER—Named Entity Recognition,” Macquarie University, Australia, 2008, 2 pages. |
Nadeau, David, et al. “A survey of named entity recognition and classification,” National Research Council Canada / New York University, Lingvisticae Investigationes, vol. 30, Jan. 1, 2007, 20 pages. |
Abraham, “FoXQ—Xquery by Forms,” Human Centric Computing Languages and Environments, IEEE Symposium, Oct. 28-31, 2003, Piscataway, New Jersey, pp. 289-290. |
Cass, “A Fountain of Knowledge,” IEEE Spectrum Online, URL: http://www.spectrum.ieee.org/WEBONLY/publicfeature/jan04/0104comp1.html, download date Feb. 4, 2004, 8 pages. |
Feldman et al., “Text Mining at the Term Level,” Proc. of the 2nd European Symposium on Principles of Data Mining and Knowledge Discovery, Nantes, France, 1998, pp. 1-9. |
Ilyas et al., “A Conceptual Architecture for Semantic Search Engine,” IEEE, INMIC, 2004, pp. 605-610. |
Jayapandian et al., “Automating the Design and Construction of Query Forms,” Data Engineering, Proceedings of the 22nd International Conference IEEE, Atlanta, Georgia, Apr. 3, 2006, pp. 125-127. |
Kaiser, “Ginseng—A Natural Language User Interface for Semantic Web Search,” University of Zurich, Sep. 16, 2004, URL=http://www.ifi.unizh.ch/archive/mastertheses/DA—Arbeiten—2004/Kaiser—Christian.pdf, pp. 1-84. |
Liang et al., “Extracting Statistical Data Frames from Text,” SIGKDD Explorations 7(1):67-75, Jun. 2005. |
Littman et al., “Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing,” In Grefenstette, G. editor, Cross Language Information Retrieval. Kluwer, 1998, pp. 1-11. |
Nagao et al., “Semantic Annotation and Transcoding: Making Web Content More Accessible,” IEEE Multimedia, IEEE Computer Society, US, 8(2):69-81, Apr. 2001. |
Nguyen et al., “Accessing Relational Databases from the World Wide Web,” SIGMOD Record ACM USA, Jun. 1996, vol. 25, No. 2, pp. 529-540. |
Pohlmann et al., “The Effect of Syntactic Phrase Indexing on Retrieval Performance for Dutch Texts,” Proceedings of RIAO, pp. 176-187, Jun. 1997. |
Rasmussen, “WDB—A Web Interface to Sybase,” Astronomical Society of the Pacific Conference Series, Astron. Soc. Pacific USA, 1995, vol. 77, pp. 72-75, 1995. |
Sneiders, “Automated Question Answering Using Question Templates That Cover the Conceptual Model of the Database,” Natural Language Processing and Information Systems, 6th International Conference on Applications of Natural Language to Information Systems, Revised Papers (Lecture Notes in Computer Science vol. 2553) Springer-Verlag Berlin, Germany, 2002 vol. 2553, pp. 235-239. |
Ruiz-Casado et al., “From Wikipedia to Semantic Relationships: a Semi-Automated Annotation Approach” 2006, pp. 1-14. |
Florian et al., “Named Entity Recognition through Classifier Combination”, 2003, IBM T.J. Watson Research Center, pp. 168-171. |
Wu et al., “A Stacked, Voted, Stacked Model for Named Entity Recognition”, 2003, pp. 1-4. |
Google “How to Interpret your Search Results”, http://web.archive.org/web/2011116075703/http://www.google.com/intl/en/help/interpret/htht, Mar. 27, 2001, 6 pages. |
Razvan Bunescu et al., “Using Encyclopedia Knowledge for Named Entity Disambiguation” 2006, Google, pp. 9-16. |
Silviu Cucerzan, “Large-Scale Named Entity Disambiguation Based on Wikipedia Data”, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Jun. 2007, pp. 708-716. |
Bunescu “Learning for Information Extraction: From Named Entity Recognition and Disambiguation to Relation Extraction”, The Dissertation Committee for Aug. 2007, The University of Texas at Austin, pp. 1-150. |
Hassell et al., “Ontology-Driven Automatic Entity Disambiguation in Unstructured Text”, Large Scale Distributed Information Systems (LSDIS) Lab Computer Science Department, University of Georgia, Athens, GA 30602-7404, ISWC, 2006, LNCS 4273, pp. 44-57. |
Lloyd et al.,“Disambiguation of References to Individuals”, IBM Research Report, Oct. 28, 2005, pp. 1-9. |
Dhillon et al., “Refining Clusters in High Dimensional Text Data,” 2002. |
Perone, Christian, Machine Learning:: Cosine Similarity for Vector Space Models (Part III), Pyevolve.sourceforge.net/wordpress/?p=2497, Sep. 12, 2013. |
Rao et al., “Entity Linking: Finding Extracted Entities in a Knowledge Base,” Jul. 12, 2012, Multi-source, Multilingual Information Extraction and Summarization Theory and Applications of Natural Language Processing, 2013, 21 pages. |
Number | Date | Country |
---|---|---|
20140229467 A1 | Aug 2014 | US |
Number | Date | Country |
---|---|---|
60999559 | Oct 2007 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 12288349 | Oct 2008 | US |
Child | 14181591 | | US |