This document relates to harvesting data from a page.
Current approaches for harvesting content from web pages face the situation that each site and page can have a unique layout comprised of multiple components in various places, such as content sections, ads, frames, columns, content boxes, page content that is divided into sub-divisions or sub-objects, web page sub-components, content articles that run or continue across several sections or pages, etc. In response to this situation, present tools for crawling and mining content from such pages have sometimes needed to be specifically programmed for the unique layout and structure of each site and/or page so that they know where the content of interest is located in the layout for that site. In other words, the programming tells the tool what parts of the page represent content to keep, and content to discard, and what sections of the page correspond to various types of content. For example, when mining a news site, it may be desired to collect and operate on the body text of the news articles in the site, but to ignore the ads and other sidebar content, etc. Existing approaches in this regard have involved programming specific mining agents, or programming a general agent with specific rules or templates. Moreover, such programming often needs to be kept up to date for each site and/or page as the content and structure of the sites and pages change over time. This can be a labor-intensive process that does not scale well to mining very large numbers of sites/pages with differing layouts and structure. Another existing solution uses statistical or natural language processing methods, or machine learning methods, to try to figure out automatically which parts of sites and pages should be kept and which should be discarded, and which parts of sites and pages correspond to various types of content.
The invention relates to harvesting data from a page.
Some implementations of the present invention relate to automatically determining which parts of a website and/or web page comprise the “important” or “targeted” text to mine.
In a first aspect, a computer-implemented method for obtaining data from a page includes initiating a harvesting process for a page available in a computer system. The method includes identifying a feed representation that has been created for the page. The method includes retrieving and storing, as part of the harvesting process, at least a portion from the page based on information in the identified feed representation.
Implementations can include any, all or none of the following features. The method can further include identifying, before retrieving the portion of the page, at least a part of the feed representation to be used for retrieving the portion from the page, the part of the feed representation including the information. The method can further include using the identified part of the feed representation to identify the portion of the page as matching the information in the feed representation. Using the identified part of the feed representation can include comparing the identified part of the feed representation with contents of the page. The feed representation can include at least excerpts of content from the page. The feed representation can include at least one representation selected from: an RSS feed, an Atom feed, an XML feed, an RDF feed, a serialized data feed representation, and combinations thereof. The method can further include identifying at least one feed entry in the feed representation, the feed entry relating to a portion of the page that links to another page; determining a URL of the other page; retrieving page content from the other page using the determined URL; and identifying, in the retrieved page content, the portion of the page as matching content from the identified feed entry. Retrieving the portion from the page based on the information can include using a text recognition technique. The method can further include determining, using the identified feed representation, which content from the page not to retrieve in the harvesting process; and identifying the portion from the page as not being included in the determined content of the page not to retrieve. The method can further include causing the determined content of the page not to be retrieved in the harvesting process.
In a second aspect, a computer program product is tangibly embodied in a computer-readable medium and includes instructions that when executed by a processor perform a method for obtaining data from a page. The method includes initiating a harvesting process for a page available in a computer system. The method includes identifying a feed representation that has been created for the page. The method includes retrieving and storing, as part of the harvesting process, at least a portion from the page based on information in the identified feed representation.
In a third aspect, a computer-implemented method for obtaining data from a page includes identifying a page as a target for content retrieval, the page including multiple content portions. The method includes identifying a feed representation that has been created for the identified page, the identified feed representation including multiple feed entries each corresponding to at least some of the multiple content portions. The method includes processing each of the multiple feed entries by: accessing the identified page; identifying any of the multiple content portions that match contents of the feed entry being processed; and retrieving at least one of the multiple content portions based on the identified content portion. The method includes storing, as a result of the content retrieval, each retrieved content portion obtained from the processing of the multiple feed entries.
Implementations can include any, all or none of the following features. At least a first content portion of the multiple content portions can link to another page, and processing a first feed entry of the multiple feed entries relating to the first content portion can include: accessing the other page to which the first content portion links; identifying contents of the accessed other page that match contents of the first feed entry; and retrieving the identified contents of the accessed other page. The feed representation can include at least excerpts of content from the identified page. The feed representation can include at least one representation selected from: an RSS feed, an Atom feed, an XML feed, an RDF feed, a serialized data feed representation, and combinations thereof. Identifying contents of the accessed page that match contents of the feed entry being processed can include using a text recognition technique. The method can further include determining, using the identified feed representation, which content from the identified page not to retrieve in the content retrieval; and identifying the at least one of the multiple content portion as not being included in the determined content of the page not to retrieve. The method can further include causing any of the multiple content portions identified as matching contents of the feed entry being processed not to be retrieved in the content retrieval.
In a fourth aspect, a computer program product is tangibly embodied in a computer-readable medium and includes instructions that when executed by a processor perform a method for obtaining data from a page. The method includes identifying a page as a target for content retrieval, the page including multiple content portions. The method includes identifying a feed representation that has been created for the identified page, the identified feed representation including multiple feed entries each corresponding to at least some of the multiple content portions. The method includes processing each of the multiple feed entries by: accessing the identified page; identifying any of the multiple content portions that match contents of the feed entry being processed; and retrieving at least one of the multiple content portions based on the identified content portion. The method includes storing, as a result of the content retrieval, each retrieved content portion obtained from the processing of the multiple feed entries.
Advantages of some implementations include: automatically targeting the harvesting activity to the desired content within sites, pages and/or parts of sites or pages; and providing a less labor intensive or computationally intensive approach to mining and harvesting the content of sites and pages with varying structure and content.
Like reference symbols in the various drawings indicate like elements.
There will now be described an exemplary implementation that relates to automatically determining which parts of a website and/or web page comprise the “important” body text to mine. The description makes reference to
The web page 100 can make selected portions of its content available in a feed representation. For example, the feed representation of a site or page can contain blurbs comprising excerpts (or in some cases, full text) of the content of the site. A news site's feed, for example, can provide blurbs or full text for each news article in the site or site section it represents. Thus, the feed representation contains some or all of the content that, according to the publisher of the page 100 or another creator of the feed representation, is considered to be “most important” or the “main content” of the page (for example, news articles), compared to other content that is “less important” or “peripheral content” (for example, ads or sections of the page containing comments and/or annotations to the page etc.) that the page may contain, when judged against a standard of relevance. Such a feed representation can be provided using an RSS feed or Atom feed, or other extensible markup language (XML) or Resource Description Framework (RDF) feed formats, to name a few examples, or any other type of serialized “data feed” representation of the content. When mentioned herein, RSS refers to protocols or technologies including, but not limited to: RDF Site Summary (sometimes referred to as RSS 0.9, RSS 1.0); Rich Site Summary (sometimes referred to as RSS 0.91, RSS 1.0); and Really Simple Syndication (sometimes referred to as RSS 2.0). Here, the web page 100 includes a link 102 that provides a feed representation of the page 100. The information in such a feed can be used to guide and target mining efforts aimed at page 100.
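As an illustration of the kind of information such a feed carries, the following minimal sketch reads the entries of an RSS 2.0 style feed using only the Python standard library. The names `FeedEntry`, `parse_rss_entries`, and `SAMPLE_FEED`, as well as the example.com URLs, are hypothetical and are not part of the described implementation; Atom and RDF based feeds use different element names and namespaces and would need their own handling.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class FeedEntry:
    """One entry of a feed: a title, a link to the full page, and an excerpt."""
    title: str
    link: str
    excerpt: str


def parse_rss_entries(feed_xml: str) -> list[FeedEntry]:
    """Return the <item> elements of an RSS 2.0 style feed as FeedEntry objects."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        entries.append(
            FeedEntry(
                title=(item.findtext("title") or "").strip(),
                link=(item.findtext("link") or "").strip(),
                excerpt=(item.findtext("description") or "").strip(),
            )
        )
    return entries


# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First article</title>
      <link>http://example.com/first.html</link>
      <description>Opening sentences of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/second.html</link>
      <description>Opening sentences of the second article.</description>
    </item>
  </channel>
</rss>"""
```

Calling `parse_rss_entries(SAMPLE_FEED)` yields one `FeedEntry` per article; the `link` and `excerpt` fields are the pieces a harvester would use to locate the corresponding content.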
For example, the feed for the page 100 may have an entry that contains the following excerpt:
In one implementation, content is harvested as follows. First, the URL to which the feed entry (Table 1) points is determined. In this example, that URL is http://novaspivack.typepad.com/nova_spivacks_weblog/2006/08/what_am-i_upjo.html. Next, the content from that URL is automatically retrieved. Next, it is determined which part of this page is the “main content,” that is, the content that should be mined. This determination is made by identifying the part of the page that matches the text from the feed entry. In this example, that is the part of the page starting with: “What is Radar Networks up to? Shel Israel and I just finished up working together for 10 days. I needed Shel's perspective on what we are working on at Radar Networks.” The matching part can be identified using any technique for comparing content, such as text recognition.
In other words, an implementation of a mining agent can automatically look for the part of the page that matches this text and identify it as the part that matters. Rather than the mining agent having to somehow parse or analyze the page to determine which content it should mine, the determination of which part(s) of the page should be mined is made by the content provider, the publisher of the feed, when they decide which content to publish in their feed—the agent simply mines the content portions that are referenced from the feed. In this example, the identified portion is the part of the page that is mined; the rest is ignored. In other implementations, more contents can be mined. Thus, without the agent being specifically programmed for the layout of this particular weblog, the agent can determine where in the layout to find the “important” content that it is supposed to mine.
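The matching described above can be sketched as follows. This is only one possible comparison technique, assuming that the page is HTML and that candidate portions can be approximated by the text of block-level elements; the class `BlockTextExtractor`, the function `find_main_content`, the tag set `BLOCK_TAGS`, and the similarity threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Assumption: these tags delimit candidate content portions.
BLOCK_TAGS = {"p", "div", "td", "li", "article", "section"}


class BlockTextExtractor(HTMLParser):
    """Collects the text between block-level tag boundaries as candidate portions."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._current = []

    def flush(self):
        # Collapse whitespace and keep the piece only if it contains text.
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._current.append(data)


def find_main_content(page_html, excerpt, threshold=0.6):
    """Return the candidate portion whose beginning best matches the feed excerpt."""
    parser = BlockTextExtractor()
    parser.feed(page_html)
    parser.close()
    parser.flush()
    best, best_score = None, 0.0
    for block in parser.blocks:
        # Compare the excerpt against an equally long prefix of the block,
        # since a feed excerpt is often only the start of the full text.
        prefix = block[: len(excerpt)]
        score = SequenceMatcher(None, excerpt.lower(), prefix.lower(), autojunk=False).ratio()
        if score > best_score:
            best, best_score = block, score
    return best if best_score >= threshold else None
```

Given a page containing an advertisement, the weblog post, and a sidebar, `find_main_content(page_html, excerpt)` returns only the post text when `excerpt` is the blurb from the feed entry. A fuller implementation might expand the match to the enclosing container so that later paragraphs of the same article are also captured.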
Such an agent (or methods performed in harvesting operations) can be configured and used with an aim toward harvesting only the main content of a specific site or page, and not the advertisements or other peripheral content. It could also be used conversely to figure out which content to ignore, for example if the goal is to filter out the main content so that the peripheral content can be mined, in which case the agent mines everything except what is provided in the feed for a given page.
The first device can make one or more pages available, for example to the second device and/or other entities on the network 206. In this example, only a few pages 208 and 210 are shown for clarity, but other implementations can make available dozens, hundreds or millions of pages or more.
Here, the page 208 contains links to the other pages 210. That is, the page 208 includes a portion 208A linking to the page 210A; a portion 208B linking to the page 210B; and a portion 208C linking to the page 210C. Users visiting the page 208 can access any of the pages 210 by activating the corresponding link(s) on the page 208. As other examples, a user can access any of the pages 210 directly by entering its address into a browser or other client program, or by navigating to the page using a link on another page (not shown). For example, the page 208 in some implementations can include the web page 100 (
In this example, the pages 210 are shown as residing within the same device as the page 208 (i.e., the first device 202). In other implementations, one or more of the pages 210 can be located on another device or in another system. That is, the pages 210 need not have been published by the same entity as the page 208, or be controlled by that entity.
Here, the first device also makes available a feed representation 212 associated with the page 208. The feed representation can have any of a number of types. In some implementations, the feed representation 212 can be an RSS feed, an Atom feed, an XML feed, an RDF feed, a serialized data feed representation, or a combination thereof. In short, the publisher of the page 208 can provide the feed representation 212 to complement the provision of information through the page 208, for example to highlight selected portions of that page's content. For example, the feed representation 212 in some implementations can include the feed representation for the web page 100 available through the link 102 (
The feed representation 212 includes one or more excerpts of content from the page 208. Here, the feed representation includes a part 212A associated with the portion 208A; a part 212B associated with the portion 208B; and a part 212C associated with the portion 208C. For example, a feed entry in any of the parts 212A-C can include the contents shown in the exemplary Table 1 above.
The second device 204 can be configured for seeking out information available through the network 206 that is of interest according to one or more relevance standards. The second device can also retrieve identified contents and store them temporarily or indefinitely for one or more purposes, such as to perform additional processing on the information, or to forward it to another entity (not shown), to name just a few examples.
Here, the second device includes a content harvester 214 that is configured to perform such content retrieval. For example, the content harvester 214 can initiate a harvesting process for a page available through the network 206. The content harvester 214 can provide for storing and/or other processing of the retrieved content.
Here, the second device also includes an information identifier 216 that can identify the information or other content to be retrieved, and pass this information on to the content harvester 214. As one example, the information identifier 216 can identify a feed representation that has been created for the page. As another example, the information identifier 216 can identify at least a part of the feed representation to be used for retrieving a portion from the page. The content harvester 214 can retrieve and store at least a portion from the page based on information in the identified feed representation.
Assume, for example, that a feed entry in the part 212A of the feed representation 212 includes the contents in Table 1 above. The information identifier 216 can use the identified part 212A to identify the portion 208A as matching the information in the feed representation. The information identifier 216 can use any of several techniques in this operation. For example, the information identifier 216 can compare the identified part 212A with contents of the page to identify the portion 208A. As another example, retrieving the portion from the page based on the information can include using a text recognition technique, such as by parsing text in the feed entry and in the page content, and using the text recognition to match the feed entry with the page portion.
With the information identifier 216 having identified, say, the portion 208A as corresponding to an entry in the part 212A of the feed representation 212, the content harvester 214 can in some implementations retrieve at least that portion 208A from the page 208.
In contrast, when the feed entries correspond to a page that links to content on one or more pages, for example like the page 208, the information identifier 216 can determine a page identifier, such as a uniform resource locator (URL) of the other page to which the page links. The content harvester 214 can then retrieve page content from the other page using the determined URL. The information identifier 216 can identify, in the retrieved page content, the portion of the page as matching content from an identified feed entry. Based on the identification, the matching page portion is retrieved in the harvesting process.
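The link-following case can be sketched as below. The retrieval function is passed in as a parameter so that the matching logic can be exercised without network access; `fetch_url`, `paragraphs`, and `harvest_linked_page` are hypothetical names, and the regular-expression paragraph extraction is a deliberate simplification that assumes well-formed `<p>` elements.

```python
import re
import urllib.request
from typing import Callable, Optional


def fetch_url(url: str) -> str:
    """Retrieve page content over the network (one possible fetcher)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so small formatting differences are ignored."""
    return " ".join(text.split()).lower()


def paragraphs(page_html: str) -> list[str]:
    """Rough extraction of the text inside each <p>...</p>, with inner tags removed."""
    raw = re.findall(r"<p\b[^>]*>(.*?)</p>", page_html, flags=re.IGNORECASE | re.DOTALL)
    return [" ".join(re.sub(r"<[^>]+>", " ", p).split()) for p in raw]


def harvest_linked_page(
    entry_link: str,
    entry_excerpt: str,
    fetch: Callable[[str], str] = fetch_url,
) -> Optional[str]:
    """Fetch the page a feed entry points to and return the paragraph matching its excerpt."""
    page_html = fetch(entry_link)
    needle = normalize(entry_excerpt)
    if not needle:
        return None
    for para in paragraphs(page_html):
        if needle in normalize(para):
            return para
    return None
```

In use, the link and excerpt would come from a feed entry, and the default `fetch_url` would retrieve the linked page; in a test, a dictionary lookup can stand in for the network.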
The above examples focus on retrieving some or all of the content that has been chosen for inclusion in a feed representation. Other approaches can be used. For example, the feed representation can be used for retrieval of content that has not been chosen for inclusion in a feed representation. This can provide the advantage of helping to avoid information that qualifies as the main or central content of a page according to a relevance standard.
In some implementations, the information identifier 216 can be configured to ignore or omit contents that have entries in the feed representation 212 when identifying portions of the page 208. That is, the information identifier 216 can use the feed representation 212 to determine which content from the page 208 is not to be retrieved in the harvesting process. For example, the information identifier 216 can find one or more portions of the page 208 that are not included in the feed representation, and identify such content as being a candidate for retrieval. The content harvester 214 can be configured so that the determined content of the page (i.e., contents that have entries in the feed representation) is not retrieved in the harvesting process.
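This converse use of the feed can be sketched as a simple partition of page portions. The sketch assumes the page has already been divided into text portions by some other means and that a portion counts as "in the feed" when it contains a feed excerpt after whitespace and case normalization; `split_by_feed` is a hypothetical helper, not a component named in this description.

```python
def split_by_feed(portions: list[str], excerpts: list[str]) -> tuple[list[str], list[str]]:
    """Partition page portions into (covered by the feed, not covered by the feed)."""

    def norm(text: str) -> str:
        return " ".join(text.split()).lower()

    wanted = [norm(e) for e in excerpts if e.strip()]
    in_feed, not_in_feed = [], []
    for portion in portions:
        normalized = norm(portion)
        if any(e in normalized for e in wanted):
            in_feed.append(portion)
        else:
            not_in_feed.append(portion)
    return in_feed, not_in_feed
```

A harvester aimed at the main content would keep the first list; one aimed at peripheral content, as described above, would keep the second list and leave the first unretrieved.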
As shown, the method 300 includes a step 310 of initiating a harvesting process for a page available in a computer system. For example, this can involve initiating or launching any or all of the second device 204, the content harvester 214 or the information identifier 216. The harvesting process can be directed to the page 208 and/or to any or all of the pages 210.
The method 300 includes a step 320 of identifying a feed representation that has been created for the page. For example, the information identifier 216 can identify the feed representation 212 as having been created for the page 208.
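One way the step 320 could be carried out, offered only as an assumption, is to look in the page markup for the widely used feed autodiscovery convention, in which a page advertises its feed with a `<link rel="alternate">` element whose `type` is an RSS or Atom media type. This convention is not mentioned in this description; `FeedLinkFinder` and `discover_feeds` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects the URLs of feeds advertised through <link rel="alternate"> elements."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        media_type = attributes.get("type", "").lower()
        href = attributes.get("href", "")
        if "alternate" in rels and media_type in FEED_TYPES and href:
            # Resolve relative feed addresses against the page address.
            self.feed_urls.append(urljoin(self.base_url, href))


def discover_feeds(page_html: str, page_url: str) -> list[str]:
    """Return the feed URLs that the page at page_url advertises in its markup."""
    finder = FeedLinkFinder(page_url)
    finder.feed(page_html)
    finder.close()
    return finder.feed_urls
```

A page that exposes its feed only through a visible link, or not at all in its markup, would need a different identification strategy, such as a configured mapping from pages to feeds.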
The method 300 includes a step 330 of retrieving and storing, as part of the harvesting process, at least a portion from the page based on information in the identified feed representation. For example, the content harvester 214 can retrieve any or all of the portions 208A-C from the page 208, and/or any or all contents from any of the pages 210, and store them in the second device 204.
One or more other steps can be performed before, in between, and/or after the steps of the method 300. For example, the second device 204 or another device can be configured to process retrieved content.
As another example, the following is an outline description of an implementation of a harvesting method.
1. Get the RSS or Atom feed for site or page x
A related example will now be described with reference to
As shown, the method 400 includes a step 410 of identifying a page as a target for content retrieval, the page including multiple content portions.
The method 400 includes a step 420 of identifying a feed representation that has been created for the identified page, the identified feed representation including multiple feed entries each corresponding to at least some of the multiple content portions.
The method 400 includes looped steps 430-470 of processing each of the multiple feed entries.
The step 440 in the loop includes accessing the identified page.
The step 450 in the loop includes identifying any of the multiple content portions that match contents of the feed entry being processed.
The step 460 in the loop includes retrieving at least one of the multiple content portions based on the identified content portion.
The step 470 in the loop indicates that the loop can be performed for each feed entry.
The method 400 includes a step 480 of storing, as a result of the content retrieval, each retrieved content portion obtained from the processing of the multiple feed entries.
One or more additional steps can be performed with the method 400, for example as described above with reference to method 300.
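The loop structure of the method 400 can be sketched as follows, with comments mapping each line to the step numbers used above. The sketch assumes that accessing the page yields a list of text portions and that matching is done by normalized substring comparison; `harvest_page` and its `load_portions` parameter are hypothetical names introduced for illustration.

```python
from typing import Callable


def _norm(text: str) -> str:
    """Collapse whitespace and lowercase for a tolerant comparison."""
    return " ".join(text.split()).lower()


def harvest_page(
    page_url: str,
    feed_excerpts: list[str],
    load_portions: Callable[[str], list[str]],
) -> list[str]:
    """Process each feed entry against the page and return the retrieved portions."""
    stored = []
    for excerpt in feed_excerpts:                  # steps 430-470: one pass per feed entry
        portions = load_portions(page_url)         # step 440: access the identified page
        wanted = _norm(excerpt)
        matching = [                               # step 450: portions matching this entry
            p for p in portions if wanted and wanted in _norm(p)
        ]
        for portion in matching:                   # step 460: retrieve the matching portions
            if portion not in stored:
                stored.append(portion)
    return stored                                  # step 480: result to be stored
```

Here the page is accessed once per feed entry to mirror the step 440 inside the loop; an implementation could equally access the page once and reuse the result across entries.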
When one or more content portions of the page link to another page, the processing of a feed entry relating to such a content portion can include: accessing the other page to which the content portion links; identifying contents of the accessed other page that match contents of the feed entry; and retrieving the identified contents of the accessed other page.
Other approaches in line with one or more aspects of this description can be used.
The memory 1120 stores information within the system 1100. In one implementation, the memory 1120 is a computer-readable medium. In one implementation, the memory 1120 is a volatile memory unit. In another implementation, the memory 1120 is a non-volatile memory unit.
The storage device 1130 is capable of providing mass storage for the system 1100. In one implementation, the storage device 1130 is a computer-readable medium. In various different implementations, the storage device 1130 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
The input/output device 1140 provides input/output operations for the system 1100. In one implementation, the input/output device 1140 includes a keyboard and/or pointing device. In another implementation, the input/output device 1140 includes a display unit for displaying graphical user interfaces.
The features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device or in a propagated signal, for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output. The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
To provide for interaction with a user, the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer.
The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include, e.g., a LAN, a WAN, and the computers and networks forming the Internet.
The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network, such as the described one. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of this description.
A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of this disclosure. Accordingly, other embodiments are within the scope of the following claims.
This patent application claims priority from provisional patent application Ser. No. 60/821,891, filed Aug. 9, 2006 and entitled “HARVESTING DATA FROM PAGE”, the entire contents of which are incorporated herein by reference.
| 20050160065 | Seeman | Jul 2005 | A1 |
| 20050165743 | Bharat et al. | Jul 2005 | A1 |
| 20050210000 | Michard | Sep 2005 | A1 |
| 20050267872 | Galai et al. | Dec 2005 | A1 |
| 20050278309 | Evans et al. | Dec 2005 | A1 |
| 20050278390 | Kaler et al. | Dec 2005 | A1 |
| 20060004703 | Spivack et al. | Jan 2006 | A1 |
| 20060004732 | Odom | Jan 2006 | A1 |
| 20060004892 | Lunt et al. | Jan 2006 | A1 |
| 20060020596 | Liu et al. | Jan 2006 | A1 |
| 20060026147 | Cone et al. | Feb 2006 | A1 |
| 20060074726 | Forbes et al. | Apr 2006 | A1 |
| 20060085788 | Amir et al. | Apr 2006 | A1 |
| 20060151507 | Swartz et al. | Jul 2006 | A1 |
| 20060168510 | Bryar et al. | Jul 2006 | A1 |
| 20060184617 | Nicholas et al. | Aug 2006 | A1 |
| 20060200434 | Flinn et al. | Sep 2006 | A1 |
| 20060200478 | Pasztor et al. | Sep 2006 | A1 |
| 20060213976 | Inakoshi et al. | Sep 2006 | A1 |
| 20060230011 | Tuttle et al. | Oct 2006 | A1 |
| 20060235873 | Thomas | Oct 2006 | A1 |
| 20060242013 | Agarwal et al. | Oct 2006 | A1 |
| 20060242574 | Richardson et al. | Oct 2006 | A1 |
| 20060248045 | Toledano et al. | Nov 2006 | A1 |
| 20060259357 | Chiu | Nov 2006 | A1 |
| 20060287989 | Glance | Dec 2006 | A1 |
| 20070016771 | Allison et al. | Jan 2007 | A1 |
| 20070027865 | Bartz et al. | Feb 2007 | A1 |
| 20070038610 | Omoigui | Feb 2007 | A1 |
| 20070038643 | Epstein | Feb 2007 | A1 |
| 20070050338 | Strohm et al. | Mar 2007 | A1 |
| 20070061198 | Ramer et al. | Mar 2007 | A1 |
| 20070081197 | Omoigui | Apr 2007 | A1 |
| 20070118802 | Gerace et al. | May 2007 | A1 |
| 20070124202 | Simons | May 2007 | A1 |
| 20070143502 | Garcia-Martin et al. | Jun 2007 | A1 |
| 20070174270 | Goodwin et al. | Jul 2007 | A1 |
| 20070179954 | Kudoh et al. | Aug 2007 | A1 |
| 20070208703 | Shi et al. | Sep 2007 | A1 |
| 20070208714 | Ture et al. | Sep 2007 | A1 |
| 20070220893 | Woltmann et al. | Sep 2007 | A1 |
| 20070260598 | Odom | Nov 2007 | A1 |
| 20080010291 | Poola et al. | Jan 2008 | A1 |
| 20080010292 | Poola | Jan 2008 | A1 |
| 20080021924 | Hall et al. | Jan 2008 | A1 |
| 20080034058 | Korman et al. | Feb 2008 | A1 |
| 20080059519 | Griffith | Mar 2008 | A1 |
| 20080091656 | Charnock et al. | Apr 2008 | A1 |
| 20080109212 | Witbrock et al. | May 2008 | A1 |
| 20080148193 | Moetteli | Jun 2008 | A1 |
| 20080235383 | Schneider | Sep 2008 | A1 |
| 20080243838 | Scott et al. | Oct 2008 | A1 |
| 20080262964 | Bezos et al. | Oct 2008 | A1 |
| 20080270428 | McNamara et al. | Oct 2008 | A1 |
| 20080306959 | Spivack et al. | Dec 2008 | A1 |
| 20090030982 | Spivack et al. | Jan 2009 | A1 |
| 20090076887 | Spivack et al. | Mar 2009 | A1 |
| 20090077062 | Spivack et al. | Mar 2009 | A1 |
| 20090077094 | Bodain | Mar 2009 | A1 |
| 20090077124 | Spivack et al. | Mar 2009 | A1 |
| 20090077531 | Miloslavsky et al. | Mar 2009 | A1 |
| 20090089278 | Poola et al. | Apr 2009 | A1 |
| 20090089286 | Kumar et al. | Apr 2009 | A1 |
| 20090106307 | Spivack | Apr 2009 | A1 |
| 20090138565 | Shiff et al. | May 2009 | A1 |
| 20090144240 | Singh et al. | Jun 2009 | A1 |
| 20090144612 | Ishii et al. | Jun 2009 | A1 |
| 20090171984 | Park et al. | Jul 2009 | A1 |
| 20090192972 | Spivack et al. | Jul 2009 | A1 |
| 20090234711 | Ramer et al. | Sep 2009 | A1 |
| 20090254414 | Schwarz et al. | Oct 2009 | A1 |
| 20090254971 | Herz et al. | Oct 2009 | A1 |
| 20090327304 | Agarwal et al. | Dec 2009 | A1 |
| 20100004975 | White et al. | Jan 2010 | A1 |
| 20100046842 | Conwell | Feb 2010 | A1 |
| 20100049842 | Koski | Feb 2010 | A1 |
| 20100057815 | Spivack et al. | Mar 2010 | A1 |
| 20100070448 | Omoigui | Mar 2010 | A1 |
| 20100070542 | Feinsmith | Mar 2010 | A1 |
| 20100100545 | Jeavons | Apr 2010 | A1 |
| 20100153160 | Bezemer et al. | Jun 2010 | A1 |
| 20100235918 | Mizrahi et al. | Sep 2010 | A1 |
| 20100262592 | Brawer et al. | Oct 2010 | A1 |
| 20100268596 | Wissner et al. | Oct 2010 | A1 |
| 20100268700 | Wissner et al. | Oct 2010 | A1 |
| 20100268702 | Wissner et al. | Oct 2010 | A1 |
| 20100268720 | Spivack et al. | Oct 2010 | A1 |
| 20110066525 | Hulst et al. | Mar 2011 | A1 |

| Number | Date | Country |
|---|---|---|
| 2007094592 | Apr 2007 | JP |
| 20010028737 | Apr 2001 | KR |
| 1020040017824 | Feb 2004 | KR |
| 20050023583 | Mar 2005 | KR |
| 1020060046522 | May 2006 | KR |
| 20060117707 | Nov 2006 | KR |
| 20070061116 | Jun 2007 | KR |
| WO-2010120925 | Oct 2010 | WO |
| WO-2010120929 | Oct 2010 | WO |
| WO-2010120934 | Oct 2010 | WO |
| WO-2010120941 | Oct 2010 | WO |

| Entry |
|---|
| Microsoft Computer Dictionary, p. 181 (Microsoft Press, 5th ed., 2002). |
| Fry, John, and Artificial Intelligence Center. “Assembling a parallel corpus from RSS news feeds.” In MT Summit X, p. 59. 2005. |
| International Search Report PCT/US2008/010337 dated Dec. 28, 2009; pp. 1-3. |
| International Search Report PCT/US2009/002867 dated Dec. 18, 2009; pp. 1-3. |
| Written Opinion PCT/US2009/002867 dated Dec. 18, 2009; pp. 1-5. |
| Written Opinion PCT/US2008/010337 dated Dec. 28, 2009; pp. 1-3. |
| “Improved markup language for semantic Web using object oriented technology, 21-24”—Kangchan, et al., IEEE, vol. 1, Sep. 2003 (pp. 330-334). |
| “Re-integrating the research record”—Myers, et al., IEEE vol. 5, May-Jun. 2003 (pp. 44-50). |
| International Search Report PCT/US2008/010596 dated Mar. 24, 2009. |
| International Search Report PCT/US2008/011474 dated May 29, 2009. |
| International Search Report, PCT/US07/75379, (Aug. 5, 2008). |
| ‘A klog apart’ [online]. Dijest.com 2003. [retrieved on May 3, 2007]. Retrieved from the Internet: <URL: http://www.dijest.com/aka/2003/06/20.html>, 9 pages. |
| ‘Blogs as information spaces’ [online]. Reflective Surface, 2003, [retrieved on May 3, 2007]. Retrieved from the Internet: <URL: http://log.reflectivesurface.com/2003/12/>, 6 pages. |
| Cass, “A Fountain of Knowledge,” IEEE Spectrum, Jan. 2004, pp. 68-75. |
| ‘Development Notebook’ [online]. Dannyayers, 2003, [retrieved on May 4, 2007]. Retrieved from the Internet: <URL: webarchive.org/web/20031012055838/http://dannyayers.com/ideagraph-blog/archives/cat_jemblog.html>, 6 pages. |
| “Ontology-Driven Peer Profiling in Peer-to-Peer Enabled Semantic Web”—Olena Parhomenko, Yugyung Lee, E.K. Park—CIKM'03 Information and Knowledge Management Nov. 2003 (pp. 564-567). |
| ‘Organizing weblogs by topic’ [online]. Read/Write Web, 2003, [retrieved on May 3, 2007]. Retrieved from the Internet: <URL: http://www.readwriteweb.com/archives/organizing_webl.php>, 3 pages. |
| “SEAL—A Framework for Developing SEmantic PortALs”—Nenad Stojanovic, Alexander Maedche, Steffen Staab, Rudi Studer and York Sure, K-CAP'01, Oct. 22-23, 2001 ACM (pp. 155-162). |
| “Semantic-Based Approach to Component Retrieval”—Vijayan Sugumaran and Veda C. Storey—ACM SIGMIS Database vol. 24, Issue 3 Aug. 2003 (pp. 8-24). |
| ‘Semantic Email’ [online]. University of Washington, [retrieved on May 3, 2007]. Retrieved from the Internet: <URL: http://www.cs.washington.edu/research/semweb/email.html>, 2 pages. |
| ‘Semantic link’ [online]. Meta, 2005 [retrieved on May 3, 2007]. Retrieved from the Internet: <URL: http://meta.wikimedia.org/wiki/Semantic_link>, 1 page. |
| Written Opinion PCT/US2010/031090 dated Nov. 2, 2010; pp. 1-6. |
| Milos Kudelka, et al., “Semantic Annotation of Web Pages Using Web Patterns”, IEEE/WIC/ACM/ International Conference on Web Intelligence, pp. 1-12, Dec. 18, 2006. |
| International Search Report PCT/US2010/031096 dated Nov. 22, 2010; pp. 1-3. |
| International Search Report PCT/US2010/031101 dated Nov. 26, 2010; pp. 1-3. |
| International Search Report PCT/US2010/031111 dated Nov. 26, 2010; pp. 1-3. |
| International Search Report PCT/US2010/039381 dated Jan. 5, 2011; pp. 1-3. |
| Non-Final Office Action mailed Nov. 29, 2010 in U.S. Appl. No. 11/874,881, filed Oct. 18, 2007. |
| European Supplementary Search Report EP 08 839486.1 Dated Dec. 27, 2010 pp. 1-8. |
| U.S. Appl. No. 12/819,999, filed Jun. 21, 2010. |
| U.S. Appl. No. 60/546,794, filed Feb. 23, 2004. |
| U.S. Appl. No. 60/427,550, filed Nov. 20, 2002. |
| U.S. Appl. No. 60/821,891, filed Aug. 9, 2006. |
| U.S. Appl. No. 60/972,815, filed Sep. 16, 2007. |
| U.S. Appl. No. 60/981,104, filed Oct. 18, 2007. |
| U.S. Appl. No. 61/169,662, filed Apr. 15, 2009. |
| U.S. Appl. No. 61/169,669, filed Apr. 15, 2009. |
| U.S. Appl. No. 61/169,677, filed Apr. 15, 2009. |
| U.S. Appl. No. 61/218,709, filed Jun. 19, 2009. |
| Written Opinion PCT/US2010/031096 dated Nov. 22, 2010; pp. 1-4. |
| Written Opinion PCT/US2010/031101 dated Nov. 26, 2010; pp. 1-4. |
| Written Opinion PCT/US2010/031111 dated Nov. 26, 2010; pp. 1-6. |
| Written Opinion PCT/US2010/039381 dated Jan. 5, 2011; pp. 1-4. |
| Final Office Action mailed May 11, 2011 for U.S. Appl. No. 11/874,881, filed Oct. 18, 2007, 31 pages. |
| Final Office Action mailed Apr. 25, 2011 for U.S. Appl. No. 12/616,085, filed Nov. 10, 2009, 32 pages. |
| Non-Final Office Action mailed May 11, 2011 for U.S. Appl. No. 11/874,882, filed Oct. 18, 2007, 30 pages. |
| Restriction Requirement mailed Mar. 10, 2011 for U.S. Appl. No. 11/874,882, filed Oct. 18, 2007, 5 pages. |
| Non-Final Office Action mailed Mar. 18, 2011 in U.S. Appl. No. 12/359,236, filed Jan. 23, 2009, 33 pages. |
| Choi, Y. et al., “Refinement Method of Post-Processing and Training for Improvement of Automated Text Classification,” Computational Science and its Applications—ICCSA 2006: International Conference, Glasgow, UK, May 8-11, 2006, pp. 298-308. |
| Restriction Requirement mailed Jul. 26, 2011 for U.S. Appl. No. 12/244,740, filed Oct. 2, 2008, 5 pages. |
| Restriction Requirement mailed Aug. 5, 2011 in U.S. Appl. No. 12/168,034, filed Jul. 3, 2008, 10 pages. |
| Final Office Action mailed Aug. 17, 2011 for U.S. Appl. No. 12/616,085, filed Nov. 10, 2009, 23 pages. |
| Final Office Action mailed Aug. 17, 2011 for U.S. Appl. No. 12/197,207, filed Aug. 22, 2008, 41 pages. |
| Non Final Office Action mailed Jul. 27, 2011 for U.S. Appl. No. 12/489,352, filed Jun. 22, 2009, 28 pages. |
| Final Office Action mailed May 26, 2011 in U.S. Appl. No. 11/873,388, filed Oct. 16, 2007, 28 pages. |
| Advisory Action mailed Oct. 20, 2011 for U.S. Appl. No. 11/874,881, 3 pages. |
| U.S. Appl. No. 11/873,388, filed Oct. 16, 2007. |
| U.S. Appl. No. 11/874,881, filed Oct. 18, 2007. |
| U.S. Appl. No. 11/874,882, filed Oct. 18, 2007. |
| U.S. Appl. No. 12/168,034, filed Jul. 3, 2008. |
| U.S. Appl. No. 12/244,740, filed Oct. 2, 2008. |
| U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| U.S. Appl. No. 12/616,085, filed Nov. 10, 2009. |
| U.S. Appl. No. 10/719,652, filed Nov. 20, 2003. |
| U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| U.S. Appl. No. 12/359,236, filed Jan. 23, 2009. |
| U.S. Appl. No. 12/359,230, filed Jan. 23, 2009. |
| U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| U.S. Appl. No. 12/197,207, filed Aug. 22, 2008. |
| U.S. Appl. No. 11/835,079, filed Aug. 7, 2007. |
| U.S. Appl. No. 12/104,366, filed Apr. 16, 2008. |
| U.S. Appl. No. 12/760,387, filed Apr. 14, 2010. |
| U.S. Appl. No. 12/760,411, filed Apr. 14, 2010. |
| U.S. Appl. No. 12/760,424, filed Apr. 14, 2010. |
| U.S. Appl. No. 12/489,352, filed Jun. 22, 2009. |
| Non-Final Office Action mailed May 22, 2006 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Nov. 16, 2006 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Final Office Action mailed May 1, 2007 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Appeals Conference Proceed to BPAI mailed Oct. 17, 2007 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Feb. 20, 2008 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Final Office Action mailed Aug. 5, 2008 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Advisory Action mailed Oct. 17, 2008 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Feb. 3, 2009 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Restriction Requirement mailed May 26, 2009 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Notice of Allowance mailed Sep. 16, 2009 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Notice of Allowance mailed Nov. 4, 2009 for Issued Patent No. 7,640,267, U.S. Appl. No. 10/719,002, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Feb. 7, 2007 for Issued Patent No. 7,398,261, U.S. Appl. No. 10/719,652, filed Nov. 20, 2003. |
| Non-Compliant or Non-Responsive Amendment mailed Nov. 20, 2007 for Issued Patent No. 7,398,261, U.S. Appl. No. 10/719,652, filed Nov. 20, 2003. |
| Notice of Allowance mailed Mar. 25, 2008 for Issued Patent No. 7,398,261, U.S. Appl. No. 10/719,652, filed Nov. 20, 2003. |
| Non-Final Office Action mailed May 5, 2006 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Final Office Action mailed Oct. 20, 2006 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Aug. 9, 2007 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Final Office Action mailed Jan. 25, 2008 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Non-Final Office Action mailed May 23, 2008 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Nov. 21, 2008 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Notice of Allowance mailed May 13, 2009 for Issued Patent No. 7,584,208, U.S. Appl. No. 10/720,031, filed Nov. 20, 2003. |
| Non-Final Office Action mailed Apr. 14, 2010 in U.S. Appl. No. 12/359,230, filed Jan. 23, 2009. |
| Non-Final Office Action mailed Jun. 1, 2007 for Issued Patent No. 7,433,876, U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| Final Office Action mailed Nov. 9, 2007 for Issued Patent No. 7,433,876, U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| Advisory Action mailed Dec. 27, 2007 for Issued Patent No. 7,433,876, U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| Non-Final Office Action mailed Feb. 21, 2008 for Issued Patent No. 7,433,876, U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| Notice of Allowance mailed Jun. 26, 2008 for Issued Patent No. 7,433,876, U.S. Appl. No. 11/062,125, filed Feb. 19, 2005. |
| Non-Final Office Action mailed Feb. 3, 2010 in U.S. Appl. No. 12/104,366, filed Apr. 16, 2008. |
| Final Office Action mailed Jul. 16, 2010 in U.S. Appl. No. 12/104,366, filed Apr. 16, 2008. |
| Non-Final Office Action mailed Aug. 25, 2010 in U.S. Appl. No. 11/835,079, filed Aug. 7, 2007. |
| Final Office Action mailed Feb. 14, 2011 in U.S. Appl. No. 11/835,079, filed Aug. 7, 2007, 63 pages. |
| Non-Final Office Action mailed Sep. 16, 2010 in U.S. Appl. No. 11/873,388, filed Oct. 16, 2007. |
| Restriction Requirement mailed Jul. 23, 2010 in U.S. Appl. No. 11/874,881, filed Oct. 18, 2007. |
| Final Office Action mailed Sep. 29, 2010 in U.S. Appl. No. 12/359,230, filed Jan. 23, 2009. |
| Non-Final Office Action mailed Sep. 17, 2010 in U.S. Appl. No. 12/616,085, filed Nov. 10, 2009. |
| International Search Report and Written Opinion PCT/US2008/010337 dated Dec. 28, 2009, pp. 1-10. |
| International Search Report and Written Opinion PCT/US2009/002867 dated Dec. 18, 2009; pp. 1-12. |
| International Search Report PCT/US07/75379; dated Aug. 5, 2008. |
| Written Opinion PCT/US2007/75379 dated Aug. 5, 2008, pp. 1-7. |
| International Search Report PCT/US2008/010596 dated Mar. 24, 2009, pp. 1-2. |
| Written Opinion PCT/US2008/010596 dated Mar. 24, 2009, pp. 1-5. |
| International Search Report PCT/US2008/011474 dated May 29, 2009, pp. 1-2. |
| Written Opinion PCT/US2008/011474 dated May 29, 2009, pp. 1-5. |
| International Search Report PCT/US2010/031090 dated Nov. 2, 2010; pp. 1-3. |
| Written Opinion PCT/US2010/031090; dated Nov. 2, 2010; pp. 1-6. |
| Written Opinion PCT/US2010/031111 dated Nov. 26, 2010; pp. 1-6. |
| A. Hotho et al., “Information Retrieval in Folksonomies: Search and Ranking,” ESWC 2006, pp. 411-426. |
| Kashyap et al., “Semantic and Schematic Similarities Between Database Objects: a Context-Based Approach,” VLDB Journal, © Springer-Verlag 1996, pp. 1-29. |
| S. Decker and M. Frank, “The Social Semantic Desktop,” DERI Technical Report 2004-05-02, May 2004, pp. 1-7. |
| W. Fang et al., “Toward a Semantic Search Engine Based on Ontologies,” Proc. 4th Int'l Conf. on Mach. Learning and Cybernetics, Aug. 2005, pp. 1913-1918. |
| X. Shen, B. Tan, and X. Zhai, “Context-sensitive information retrieval using implicit feedback,” Proc. SIGIR '05, pp. 43-50. |
| Lee et al., “Development of a Concurrent Mold Design System: a Knowledge-based Approach,” Computer Integrated Manufacturing Systems, vol. 10, Issue 4, Oct. 1997, pp. 287-307. |
| Devedžić, “A Survey of Modern Knowledge Modeling Techniques,” Expert Systems with Applications, vol. 17, Issue 4, Nov. 1999, pp. 275-294. |

| Number | Date | Country |
|---|---|---|
| 20080189267 A1 | Aug 2008 | US |

| Number | Date | Country |
|---|---|---|
| 60821891 | Aug 2006 | US |