Methods for automatic footnote generation

Description

BACKGROUND

The disclosure herein relates generally to modifying documents to include links to external information.

When reading a non-fiction document, a reader may often wonder about the reliability of statements made by an author. In formal writing, authors typically include footnotes that identify primary sources, such as research papers or books that provide support for statements of fact.

In many documents that are available via the internet, authors make statements of fact without identifying a source of reliable information that supports the statement. These documents include, for example, encyclopedia pages, blogs, news articles, advocacy group web pages, and responses on answer forums. Thus, readers are not provided with a convenient source of information with which to verify the statements made in the document.

SUMMARY

The disclosure relates to methods for automatic footnote generation.

One aspect of the disclosed embodiments is a method that includes accessing, at one or more computing devices, a document. The method also includes generating, using the one or more computing devices, a ranking score for each of a plurality of passages from external documents. The ranking score is based at least on a degree of semantic similarity of each passage with respect to a portion of the document. The method also includes modifying, using the one or more computing devices, the document to include a footnote link for the portion of the document, the footnote link including a link to the external document having the highest ranked passage therein, if the ranking score of the highest ranked passage with respect to the portion of the document exceeds a threshold value. The document is not modified to include the footnote link for the portion of the document if the ranking score of the highest ranked passage with respect to the portion of the document does not exceed the threshold value.

Another aspect of the disclosed embodiments is a non-transitory storage medium including program instructions executable by one or more processors that, when executed, cause the one or more processors to perform operations. The operations include accessing, at one or more computing devices, a document; generating, using the one or more computing devices, a ranking score for each of a plurality of passages from external documents, wherein the ranking score is based at least on a degree of semantic similarity of each passage with respect to a portion of the document; and modifying, using the one or more computing devices, the document to include a footnote link for the portion of the document, the footnote link including a link to the external document having the highest ranked passage therein, if the ranking score of the highest ranked passage with respect to the portion of the document exceeds a threshold value, wherein the document is not modified to include the footnote link for the portion of the document if the ranking score of the highest ranked passage with respect to the portion of the document does not exceed the threshold value.

Another aspect of the disclosed embodiments is an apparatus that includes one or more processors and one or more memory devices for storing program instructions used by the one or more processors. The program instructions, when executed by the one or more processors, cause the one or more processors to: access, at one or more computing devices, a document; generate, using the one or more computing devices, a ranking score for each of a plurality of passages from external documents, wherein the ranking score is based at least on a degree of semantic similarity of each passage with respect to a portion of the document; and modify, using the one or more computing devices, the document to include a footnote link for the portion of the document, the footnote link including a link to the external document having the highest ranked passage therein, if the ranking score of the highest ranked passage with respect to the portion of the document exceeds a threshold value, wherein the document is not modified to include the footnote link for the portion of the document if the ranking score of the highest ranked passage with respect to the portion of the document does not exceed the threshold value.

BRIEF DESCRIPTION OF THE DRAWINGS

The description herein makes reference to the accompanying drawings wherein like reference numerals refer to like parts throughout the several views, and wherein:

FIG. 1 is a block diagram showing an example of a system for automatic footnote generation;

FIG. 2 is a block diagram showing an example of a server computer;

FIG. 3 is an illustration showing an example of identifying and ranking passages from external documents;

FIG. 4 is an illustration showing an example of modification of a subject document; and

FIG. 5 is a flow chart showing an example of a process for automatic footnote generation.

DETAILED DESCRIPTION

In order to verify statements of fact made in an online document, a reader might navigate to a search engine, formulate a search query, and then browse the results generated by the search engine. This solution is not ideal, because formulating a query at the right level of specificity is difficult, and the user may need to try multiple queries. In addition, finding relevant passages in the results returned by the search engine can be time consuming. The systems and methods described herein are directed to automatic footnote generation in online documents. Using the systems and methods described herein, some of the statements made in online documents are automatically annotated with links to relevant external documents that include passages that support or dispute the statement. In the systems and methods herein, statements in the document are compared to passages from external documents to determine a degree of semantic similarity between each statement and the passages from external documents. The statements can be annotated by adding links to one or more of the passages. In some implementations, a determination is made for each statement as to whether or not it should be annotated. In some implementations, ranking is applied to the passages from the external documents to determine which of the external documents should be referenced in a footnote link. The ranking applied to the passages can be based on, for example, semantic similarity, authoritativeness, and/or recency of the passages.

FIG. 1 shows an example of an environment 100 in which a system for automatic footnote generation can be implemented. The environment 100 can include a user system 110 and an annotation system 130. The user system 110 is representative of a large number (e.g. millions) of user systems that can be included in the environment 100. The user system 110 can be any manner of computer or computing device, such as a desktop computer, a laptop computer, a tablet computer, or a smart-phone (a computationally-enabled mobile telephone). The annotation system 130 can be implemented using one or more server computers 140. The user system 110 and the annotation system 130 can each be implemented as a single system, multiple systems, distributed systems, or in any other form.

The systems, services, servers, and other computing devices described herein are in communication via a network 150. The network 150 can be one or more communications networks of any suitable type in any combination, including wireless networks, wired networks, local area networks, wide area networks, cellular data networks, and the internet.

The annotation system 130 provides an annotation service to the user system 110. In some implementations, all of the operations described herein with respect to automatically generating footnotes and annotating an online document are performed at the annotation system 130. In other implementations, some of the operations described herein are performed at the annotation system 130, and the other operations are performed at the user system 110.

FIG. 2 is a block diagram of an example of a hardware configuration for the one or more server computers 140. The same hardware configuration or a similar hardware configuration can be used to implement the user system 110. Each server computer 140 can include a CPU 210. The CPU 210 can be a conventional central processing unit. Alternatively, the CPU 210 can be any other type of device, or multiple devices, capable of manipulating or processing information now-existing or hereafter developed. Although the disclosed examples can be practiced with a single processor as shown, e.g. CPU 210, advantages in speed and efficiency can be achieved using more than one processor.

Each server computer 140 can include memory 220, such as a random access memory device (RAM). Any other suitable type of storage device can be used as the memory 220. The memory 220 can include code and data 222 that can be accessed by the CPU 210 using a bus 230. The memory 220 can further include one or more application programs 224 and an operating system 226. The application programs 224 can include software components in the form of computer executable program instructions that cause the CPU 210 to perform the operations and methods described herein.

A storage device 240 can be optionally provided in the form of any suitable computer readable medium, such as a hard disc drive, a memory device, a flash drive or an optical drive. One or more input devices 250, such as a keyboard, a mouse, or a gesture sensitive input device, receive user inputs and can output signals or data indicative of the user inputs to the CPU 210. One or more output devices can be provided, such as a display device 260. The display device 260, such as a liquid crystal display (LCD) or a cathode-ray tube (CRT), allows output to be presented to a user, for example, in response to receiving a video signal.

Although FIG. 2 depicts the CPU 210 and the memory 220 of each server computer 140 as being integrated into a single unit, other configurations can be utilized. The operations of the CPU 210 can be distributed across multiple machines (each machine having one or more processors) which can be coupled directly or across a local area or other network. The memory 220 can be distributed across multiple machines such as network-based memory or memory in multiple machines. Although depicted here as a single bus, the bus 230 of each server computer 140 can be composed of multiple buses. Further, the storage device 240 can be directly coupled to the other components of the respective server computer 140 or can be accessed via a network and can comprise a single integrated unit such as a memory card or multiple units such as multiple memory cards. The one or more server computers 140 can thus be implemented in a wide variety of configurations.

FIG. 3 is an illustration showing an example of an identifying and ranking operation 300. The identifying and ranking operation 300 can be performed, for example, by the one or more server computers 140 of the annotation system 130.

The identifying and ranking operation 300 is performed with respect to a subject document 310. The subject document 310 can be any type of document. As an example, the subject document 310 can be a webpage that is encoded in hypertext markup language (HTML). The subject document 310 can be accessed, for example, by the one or more server computers 140 of the annotation system 130. As one example, the one or more server computers 140 of the annotation system 130 can access the subject document 310 by receiving the subject document 310 from the user system 110. As another example, the one or more server computers 140 of the annotation system 130 can access the subject document 310 by receiving the subject document 310 from an external server computer via the network 150, which can be the internet. As another example, the one or more server computers 140 of the annotation system 130 can access the subject document 310 by receiving the subject document 310 from a storage device that is associated with the one or more server computers 140.

The subject document 310 can be divided into a plurality of document portions 320, each of which is subjected to the identifying and ranking operation 300. In one example, the document portions 320 are sentences. In this example, the subject document 310 can be divided into the document portions 320 by parsing the text contained within the subject document 310, identifying individual document portions 320 based on delimiters such as punctuation, and then storing the delimited portions of the subject document 310 as the document portions 320.
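By way of illustration only, the division of text into sentence-level document portions can be sketched as follows. This is a minimal sketch that assumes sentences end with a period, exclamation mark, or question mark followed by whitespace; the function name split_into_sentences is hypothetical and does not appear in the disclosure.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    # The lookbehind keeps the terminating punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string produced when the input is empty.
    return [part for part in parts if part]


portions = split_into_sentences("The sky is blue. Water boils at 100 C! Is that so?")
```

A delimiter-only splitter of this kind will also split after abbreviations such as "e.g." when they are followed by a space, so a practical implementation would likely need additional rules.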

Each document portion 320 is received as an input at a matching component 330. As an example, the matching component 330 can be implemented in the form of software that is executed by the one or more server computers 140 of the annotation system 130. The matching component 330 is operable to access a repository of external documents 340 and identify a plurality of passages 350 from the external documents 340 that are relevant to the document portion 320. The matching component 330 identifies the passages 350 from the external documents 340 based on relevance or similarity of the passages 350 with respect to the document portion 320. In particular, the matching component 330 can implement a search function that is based on any of a variety of well-known search algorithms to identify the passages 350. As an example, the passages 350 can be identified using a subset of words from the document portion 320 as an input for a search function, or by using an entirety of the document portion 320 as an input for the search function.
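As a non-limiting illustration of a search function that uses a subset of words from the document portion as its input, the following sketch selects passages that share a minimum number of keywords with the portion. The stop-word list, the min_overlap parameter, the in-memory dictionary standing in for the repository of external documents, and the example URLs are all assumptions made for this sketch.

```python
import re

# Illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "is", "in", "at", "and", "to"}


def keywords(text: str) -> set[str]:
    """Return the lower-cased words of text, minus stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def find_passages(portion: str,
                  external_docs: dict[str, list[str]],
                  min_overlap: int = 2) -> list[tuple[str, str]]:
    """Return (url, passage) pairs sharing at least min_overlap keywords with portion."""
    query = keywords(portion)
    matches = []
    for url, passages in external_docs.items():
        for passage in passages:
            if len(query & keywords(passage)) >= min_overlap:
                matches.append((url, passage))
    return matches


# Hypothetical repository: URL -> passages extracted from that document.
docs = {
    "https://example.org/physics": [
        "Water boils at 100 degrees Celsius at sea level.",
        "Ice melts at 0 degrees Celsius.",
    ],
    "https://example.org/cooking": ["Simmer the sauce for ten minutes."],
}
found = find_passages("Water boils at 100 degrees.", docs)
```

Keyword overlap is only one possible retrieval strategy; the description leaves the choice of search algorithm open.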

In one implementation, the matching component 330 implements a semantic search algorithm that identifies the passages 350 based on semantic similarity between the document portion 320 and the passages 350. By way of example, the matching component 330 can incorporate or utilize a search engine that indexes the external documents 340 and assesses their relevance relative to the document portion 320. For each of the external documents 340 that the matching component 330 identifies as being relevant, the portions thereof that are relevant, such as by way of semantic similarity, to the document portion 320 are extracted as the passages 350. By way of example, the passages 350 can be one or more sentences or paragraphs from the external documents 340. In examples where the document portions 320 are sentences, semantic similarity can be assessed by comparing the entirety of the document portion 320 (i.e. the entirety of the sentence) to the passages 350.
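For illustration, the comparison of an entire sentence to a passage can be sketched with a bag-of-words cosine similarity. This is a lexical stand-in chosen for brevity, not the semantic search algorithm of the disclosure; a true semantic measure would also relate different words that express the same concept. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lower-cased word occurrences in text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(sentence: str, passage: str) -> float:
    """Cosine of the angle between the word-count vectors of the two texts."""
    va, vb = bag_of_words(sentence), bag_of_words(passage)
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    # Empty input on either side yields a zero norm; treat that as no similarity.
    return dot / norm if norm else 0.0


score = cosine_similarity("Water boils at 100 degrees.",
                          "Water boils at 100 degrees Celsius at sea level.")
```

The result lies between 0 and 1 for word-count vectors, which makes it convenient to combine with other signals on the same scale.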

The document portion 320 and the passages 350 are provided as inputs to a ranking component 360. The ranking component 360 generates a ranking score 370 for each of the passages 350. The ranking score 370 for each passage is based at least in part on semantic similarity of the document portion 320 and the respective one of the passages 350. For example, the ranking component 360 can assess semantic similarity by parsing each of the document portion 320 and the passages 350 to identify concepts conveyed in each, applying a numerical relatedness score to pairs of the concepts where each pair includes a concept from the document portion 320 and a concept from one of the passages 350, and subsequently generating a semantic relatedness score based on the scores assigned to the concept pairs. Other algorithms can be utilized to determine semantic relatedness. The ranking scores 370 are based, at least in part, on the semantic relatedness score. In addition, the ranking scores 370 can be based on an authoritativeness rating for the external document 340 from which each passage 350 was extracted and/or for an author associated with that external document 340. As an example, the authoritativeness rating can be calculated using an algorithm similar to the PageRank algorithm, as described in “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” by Sergey Brin and Lawrence Page, Computer Networks and ISDN Systems, 33: 107-17, 1998. The ranking score 370 can be further based on a recency score, such as one based on the time elapsed since creation or modification of the respective one of the external documents 340 from which the passage 350 was extracted. In one example, the ranking score 370 can be computed as a weighted average of the semantic similarity score, the authoritativeness score, and the recency score.
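As one hedged illustration of the weighted average described above, the following sketch combines three component scores that are each assumed to lie between 0 and 1. The default weights, the exponential-decay form of the recency score, and the half-life parameter are assumptions made for this sketch and are not specified by the disclosure.

```python
from dataclasses import dataclass


@dataclass
class PassageSignals:
    semantic_similarity: float  # assumed to be in [0, 1]
    authoritativeness: float    # assumed to be in [0, 1]
    recency: float              # assumed to be in [0, 1]


def recency_score(days_elapsed: float, half_life_days: float = 365.0) -> float:
    """Assumed decay: the score halves every half_life_days since the last change."""
    return 0.5 ** (days_elapsed / half_life_days)


def ranking_score(signals: PassageSignals,
                  weights: tuple[float, float, float] = (0.6, 0.3, 0.1)) -> float:
    """Weighted average of the three signals; weights need not sum to 1."""
    w_sem, w_auth, w_rec = weights
    total = w_sem + w_auth + w_rec
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    weighted = (w_sem * signals.semantic_similarity
                + w_auth * signals.authoritativeness
                + w_rec * signals.recency)
    return weighted / total


example = ranking_score(PassageSignals(1.0, 0.5, 0.0), weights=(2.0, 1.0, 1.0))
```

Dividing by the sum of the weights keeps the ranking score on the same 0 to 1 scale as its inputs, so a single threshold value can be applied regardless of how the weights are tuned.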

FIG. 4 is an illustration showing an example of a document modification operation 400. The document modification operation 400 can be performed, for example, by the one or more server computers 140 of the annotation system 130.

In the document modification operation 400, the subject document 310 and the ranking scores 370 are provided as inputs to a modification component 410. For each document portion 320 of the subject document 310, the modification component 410 makes a determination as to whether to annotate the respective document portion 320 of the subject document 310 by insertion of a footnote link for the document portion 320.

In one implementation, the modification component 410 modifies each document portion 320 of the subject document 310 to include a footnote link to the highest ranked passage 350 from the external documents 340 based on the ranking scores 370.

In another implementation, the modification component 410 accesses the ranking score 370 for the highest ranked passage 350 from the external documents 340. The modification component 410 compares the ranking score 370 for the highest ranked one of the passages 350 to a threshold value. If the ranking score 370 exceeds the threshold value, the subject document 310 is modified to include a footnote link for the document portion 320. For example, in an HTML document, a hyperlink can be inserted within or adjacent to the document portion 320, where the hyperlink references a URL corresponding to the external document 340 in which the passage 350 can be found. As another example, the footnote link can include a pop-up interface element that is displayed within the context of the document and shows the passage 350 while giving the user the option to navigate to the external document 340 in which the passage 350 can be found. For example, in an HTML document, this can be done by modifying the HTML document and including code portions therein that cause display of the interactive pop-up element, such as JavaScript code. The foregoing implementations are given as examples, and it should be understood that other types of footnote links can be implemented. If the ranking score 370 for the highest ranked one of the passages 350 does not exceed the threshold value, the subject document 310 is not modified to include the footnote link for the document portion 320.
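The threshold comparison and hyperlink insertion for an HTML document can be sketched as follows. This is a minimal sketch assuming the footnote link is rendered as a superscript anchor placed adjacent to the document portion; the function name, the markup, and the footnote numbering are assumptions, and the pop-up variant is not shown.

```python
from html import escape


def annotate_portion(portion_html: str,
                     best_url: str,
                     best_score: float,
                     threshold: float,
                     note_number: int) -> str:
    """Append a footnote link only if the best ranking score exceeds the threshold."""
    if best_score <= threshold:
        # Score does not exceed the threshold: leave the portion unmodified.
        return portion_html
    # Escape the URL so characters such as '&' and '"' are safe inside the attribute.
    href = escape(best_url, quote=True)
    return f'{portion_html}<sup><a href="{href}">[{note_number}]</a></sup>'


annotated = annotate_portion("Water boils at 100 C.",
                             "https://example.org/a?x=1&y=2",
                             best_score=0.9, threshold=0.5, note_number=1)
```

Because the description requires the score to exceed the threshold, a score exactly equal to the threshold leaves the portion unchanged in this sketch.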

In some implementations, the document modification operation 400 is performed at the one or more server computers 140 of the annotation system 130. In such an implementation, a copy of the subject document 310 is made at the annotation system 130. The copy of the subject document 310 is modified at the annotation system 130, and the resulting modified document 420 is transmitted from the annotation system 130 to the user system 110.

In other implementations, the document modification operation 400 is performed at the user system 110. As an example, the document modification operation 400 can be implemented by software that is executed at the user system 110, such as by way of a plug-in for a web browser software program. In one implementation, the subject document 310 is received at the user system 110, the ranking scores 370 are received at the user system 110 by transmission from the one or more server computers 140 of the annotation system 130, and the modification component 410 is executed at the user system 110 with respect to a copy of the subject document 310 that is present at the user system 110 to produce the modified document 420 at the user system 110. In another implementation, the subject document 310 is received at the user system 110, the ranking scores 370 are utilized by the one or more server computers 140 of the annotation system 130 to generate information describing one or more document modification operations, and the information describing the one or more document modification operations is transmitted to the user system 110. The information describing the one or more document modification operations, when executed by the modification component 410 at the user system 110, causes the user system 110 to annotate the subject document 310 to produce the modified document 420.
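One way to picture the information describing document modification operations is as a small serialized list that the server produces and the client applies. The following sketch assumes a JSON payload whose entries carry a portion index and a URL; the field names, the function names, and the markup are hypothetical and are not taken from the disclosure.

```python
import json
from html import escape
from typing import Optional


def build_instructions(best_matches: list[Optional[tuple[str, float]]],
                       threshold: float) -> str:
    """Server side: best_matches[i] is (url, score) for portion i, or None."""
    instructions = [
        {"portion_index": index, "url": match[0]}
        for index, match in enumerate(best_matches)
        if match is not None and match[1] > threshold
    ]
    return json.dumps(instructions)


def apply_instructions(portions: list[str], payload: str) -> list[str]:
    """Client side: append a footnote link to each portion named in the payload."""
    annotated = list(portions)  # work on a copy of the subject document's portions
    for note_number, item in enumerate(json.loads(payload), start=1):
        href = escape(item["url"], quote=True)
        annotated[item["portion_index"]] += (
            f'<sup><a href="{href}">[{note_number}]</a></sup>'
        )
    return annotated


payload = build_instructions(
    [("https://example.org/a", 0.9), None, ("https://example.org/b", 0.2)],
    threshold=0.5,
)
modified = apply_instructions(["A.", "B.", "C."], payload)
```

Sending only the instructions, rather than a full modified document, keeps the threshold decision at the server while letting a browser plug-in annotate the copy of the document it already holds.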

FIG. 5 is a flow chart showing a process 500 for automatic footnote generation. The operations described in connection with the process 500 can be performed at one or more computers, such as at the one or more server computers 140 of the annotation system 130. As used herein, the phrases “one or more computers,” “one or more computing devices,” “one or more server computers,” and similar phrases include all of the computers or groups of computers that participate in performing the process. For example, the process 500 can be performed at one or more computers in an implementation where each of the operations described herein is performed using a different group of computers, where each group of computers cooperatively performs the respective operation of the process. When an operation is performed by one or more computers, it is completed when it is performed by one computer. The operations described in connection with the process 500 can be embodied as a non-transitory computer readable storage medium including program instructions executable by one or more processors that, when executed, cause the one or more processors to perform the operations. For example, the operations described in connection with the process 500 could be stored at the memory 220 of a respective one of the server computers 140 and be executable by the CPU 210 thereof.

At operation 510, a subject document is received. As an example, the subject document 310 can be received at the one or more server computers 140 of the annotation system 130.

At operation 520, document portions are extracted from the subject document. As an example, the document portions 320 can be extracted from the subject document 310 by operations such as text parsing. In some implementations, the document portions 320 are sentences that can be extracted by analysis of the text contained within the subject document 310 including delimiters such as punctuation.

At operation 530, external documents are matched to the document portion. As previously explained, external documents can be matched to the document portion based on relevance, such as a degree of semantic similarity between the external documents and the document portion. By way of example, the matching component 330 can match the external documents 340 to the document portion 320 to identify relevant passages 350 from the external documents 340 as previously described.

At operation 540, the passages from the documents identified at operation 530 are ranked. As an example, a ranking component 360 can rank the passages 350 based on multiple factors including at least a degree of semantic similarity between the document portion 320 and the passages 350, as previously described with respect to the ranking component 360. As previously discussed, the output of operation 540 can be the ranking scores 370 for the passages 350.

At operation 550, a determination is made as to whether a threshold is satisfied by the rankings for the passages, as ranked in operation 540. For example, the ranking scores 370 that were generated by the ranking component 360 can be compared to a threshold as discussed in connection with the modification component 410. If the threshold is satisfied, the process continues to operation 560 where the document is modified to include a footnote link. This can be performed in the manner discussed in connection with the modification component 410, for example, by inserting the footnote link into a copy of the subject document 310 to generate the modified document 420.

Subsequent to modification of the document at operation 560 or if the threshold was not satisfied at operation 550, the process continues to operation 570. At operation 570, a determination is made as to whether more document portions are contained within the subject document 310 with respect to which the automated annotation process has not yet been performed. If more document portions exist for analysis, the process returns to operation 530. If all of the document portions have been analyzed, the process ends.
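The per-portion loop of operations 530 through 570 can be sketched as follows. This is an illustrative sketch only: the matching and scoring steps are passed in as callables so that any search function and any ranking function can be substituted, and the function name and return format are assumptions.

```python
from typing import Callable


def generate_footnotes(
    portions: list[str],
    find_passages: Callable[[str], list[tuple[str, str]]],
    score: Callable[[str, str], float],
    threshold: float,
) -> list[tuple[int, str]]:
    """Return (portion_index, url) for each portion whose best passage exceeds threshold."""
    footnotes = []
    for index, portion in enumerate(portions):      # operation 570: next portion
        candidates = find_passages(portion)         # operation 530: (url, passage) pairs
        scored = [(score(portion, passage), url)    # operation 540: rank passages
                  for url, passage in candidates]
        if not scored:
            continue
        best_score, best_url = max(scored, key=lambda pair: pair[0])
        if best_score > threshold:                  # operation 550: threshold check
            footnotes.append((index, best_url))     # operation 560: record footnote
    return footnotes
```

As a usage example, calling generate_footnotes with the sentences of a document, a matcher, a scorer, and a threshold of 0.5 yields the list of sentence indices to annotate together with the URL to link from each.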

The foregoing description describes only some exemplary implementations of the described techniques. Other implementations are available. For example, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Further, the system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements. Also, the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.

The words “example” or “exemplary” are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the words “example” or “exemplary” is intended to present concepts in a concrete fashion. As used in this application, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or”. That is, unless specified otherwise, or clear from context, “X includes A or B” is intended to mean any of the natural inclusive permutations. That is, if X includes A; X includes B; or X includes both A and B, then “X includes A or B” is satisfied under any of the foregoing instances. In addition, the articles “a” and “an” as used in this application and the appended claims should generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form. Moreover, use of the term “an embodiment” or “one embodiment” or “an implementation” or “one implementation” throughout is not intended to mean the same embodiment or implementation unless described as such.

The implementations of the computer devices (e.g., clients and servers) described herein can be realized in hardware, software, or any combination thereof. The hardware can include, for example, computers, intellectual property (IP) cores, application-specific integrated circuits (ASICs), programmable logic arrays, optical processors, programmable logic controllers, microcode, microcontrollers, servers, microprocessors, digital signal processors or any other suitable circuit. In the claims, the term “processor” should be understood as encompassing any of the foregoing hardware, either singly or in combination. The terms “signal” and “data” are used interchangeably. Further, portions of each of the clients and each of the servers described herein do not necessarily have to be implemented in the same manner.

Operations that are described as being performed by a single processor, computer, or device can be distributed across a number of different processors, computers or devices. Similarly, operations that are described as being performed by different processors, computers, or devices can, in some cases, be performed by a single processor, computer or device.

Although features may be described above or claimed as acting in certain combinations, one or more features of a combination can in some cases be excised from the combination, and the combination may be directed to a sub-combination or variation of a sub-combination.

The systems described herein, such as client computers and server computers, can be implemented using general purpose computers/processors with a computer program that, when executed, carries out any of the respective methods, algorithms and/or instructions described herein. In addition or alternatively, for example, special purpose computers/processors can be utilized which can contain specialized hardware for carrying out any of the methods, algorithms, or instructions described herein.

Some portions of the above description include disclosure presented in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or by functional names, without loss of generality. It should be noted that the process steps and instructions of implementations of this disclosure could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.

Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.

At least one implementation of this disclosure relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored on a computer readable storage medium that can be accessed by the computer.

All or a portion of the embodiments of the disclosure can take the form of a computer program product accessible from, for example, a non-transitory computer-usable or computer-readable medium. The computer program, when executed, can carry out any of the respective techniques, algorithms and/or instructions described herein. A non-transitory computer-usable or computer-readable medium can be any device that can, for example, tangibly contain, store, communicate, or transport the program for use by or in connection with any processor. The non-transitory medium can be, for example, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for tangibly containing, storing, communicating, or transporting electronic instructions.

It is to be understood that the disclosure is not to be limited to the disclosed embodiments but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. A method comprising: accessing, at one or more computing devices, a document; identifying a plurality of sentences of the document, each sentence identified based on punctuation in the document; and executing, for each of the sentences of the document and using the one or more computing devices, a document modification operation that includes: generating a ranking score for each of a plurality of passages from external documents, wherein the ranking score is based at least on a degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document, modifying the sentence to include a footnote link for the sentence in the document, the footnote link including a link to the external document having a highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds a threshold value, and skipping modification of the sentence if the ranking score of the highest ranked passage with respect to the sentence does not exceed the threshold value.
2. The method of claim 1, wherein the degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document is based on the entirety of the sentence.
3. The method of claim 1, wherein the ranking score for each of the plurality of passages from external documents is based further on an authoritativeness rating for each of the external documents.
4. The method of claim 1, wherein the plurality of passages from external documents are identified using a subset of words from the sentence of the document as an input for a search function.
5. The method of claim 1, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the client computing device based on data received from the server computing device, and modifying the sentence to include the footnote link is performed at the client computing device.
6. The method of claim 1, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the server computing device, modifying the sentence to include the footnote link is performed at the server computing device, and a modified document is transmitted from the server computing device to the client computing device.
7. The method of claim 1, wherein modifying the sentence to include the footnote link further includes generating a pop-up interface element for the sentence in the document, the pop-up interface element displaying the highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds the threshold value.
8. A hardware computer-readable storage medium including program instructions executable by one or more processors that, when executed, cause the one or more processors to perform operations, the operations comprising: accessing, at one or more computing devices, a document; identifying a plurality of sentences of the document, each sentence identified based on punctuation in the document; and executing, for each of the sentences of the document and using the one or more computing devices, a document modification operation that includes: generating a ranking score for each of a plurality of passages from external documents, wherein the ranking score is based at least on a degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document, modifying the sentence to include a footnote link for the sentence in the document, the footnote link including a link to the external document having a highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds a threshold value, and skipping modification of the sentence if the ranking score of the highest ranked passage with respect to the sentence does not exceed the threshold value.
9. The hardware computer-readable storage medium of claim 8, wherein the degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document is based on the entirety of the sentence.
10. The hardware computer-readable storage medium of claim 8, wherein the ranking score for each of the plurality of passages from external documents is based further on an authoritativeness rating for each of the external documents.
11. The hardware computer-readable storage medium of claim 8, wherein the plurality of passages from external documents are identified using a subset of words from the sentence of the document as an input for a search function.
12. The hardware computer-readable storage medium of claim 8, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the client computing device based on data received from the server computing device, and modifying the sentence to include the footnote link is performed at the client computing device.
13. The hardware computer-readable storage medium of claim 8, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the server computing device, modifying the sentence to include the footnote link is performed at the server computing device, and a modified document is transmitted from the server computing device to the client computing device.
14. The hardware computer-readable storage medium of claim 8, wherein modifying the sentence to include the footnote link further includes generating a pop-up interface element for the sentence in the document, the pop-up interface element displaying the highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds the threshold value.
15. An apparatus, comprising: one or more processors; and one or more memory devices for storing program instructions used by the one or more processors, wherein the program instructions, when executed by the one or more processors, cause the one or more processors to: access, at one or more computing devices, a document; identify a plurality of sentences in the document, each sentence identified based on punctuation in the document; and execute, for each of the sentences of the document and using the one or more computing devices, a document modification operation that causes the one or more processors to: generate a ranking score for each of a plurality of passages from external documents, wherein the ranking score is based at least on a degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document, modify the sentence to include a footnote link for the sentence in the document, the footnote link including a link to the external document having a highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds a threshold value, and skip modification of the sentence if the ranking score of the highest ranked passage with respect to the sentence does not exceed the threshold value.
16. The apparatus of claim 15, wherein the degree of semantic similarity of each of the plurality of passages from the external documents with respect to the sentence of the document is based on the entirety of the sentence.
17. The apparatus of claim 15, wherein the ranking score for each of the plurality of passages from external documents is based further on an authoritativeness rating for each of the external documents.
18. The apparatus of claim 15, wherein the plurality of passages from external documents are identified using a subset of words from the sentence of the document as an input for a search function.
19. The apparatus of claim 15, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the client computing device based on data received from the server computing device, and modifying the sentence to include the footnote link is performed at the client computing device.
20. The apparatus of claim 15, wherein the one or more computing devices include a server computing device and a client computing device, generating the ranking score for each of a plurality of passages from external documents is performed at the server computing device, modifying the sentence to include the footnote link is performed at the server computing device, and a modified document is transmitted from the server computing device to the client computing device.
21. The apparatus of claim 15, wherein modifying the sentence to include the footnote link further includes generating a pop-up interface element for the sentence in the document, the pop-up interface element displaying the highest ranked passage therein if the ranking score of the highest ranked passage with respect to the sentence exceeds the threshold value.
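
For illustration only, the per-sentence operation recited in claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Passage`, `split_sentences`, `similarity`, and `add_footnotes` are hypothetical, the word-overlap (Jaccard) measure is a simple stand-in for whatever semantic similarity measure an embodiment uses, and the threshold of 0.5 and the bracketed footnote marker are arbitrary choices.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Passage:
    """A candidate passage and the URL of its external document (hypothetical type)."""
    text: str
    url: str


def split_sentences(document: str) -> list[str]:
    # Identify sentences from punctuation: split after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def similarity(a: str, b: str) -> float:
    # Assumption: Jaccard overlap of lowercase word sets stands in for semantic similarity.
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def add_footnotes(document: str, passages: list[Passage],
                  threshold: float = 0.5) -> tuple[str, list[str]]:
    """Return the modified document text and the list of footnote URLs."""
    modified: list[str] = []
    footnotes: list[str] = []
    for sentence in split_sentences(document):
        # Score every passage against the whole sentence and keep the highest ranked.
        scored = [(similarity(sentence, p.text), p) for p in passages]
        best_score, best = max(scored, key=lambda sp: sp[0], default=(0.0, None))
        if best is not None and best_score > threshold:
            # Score exceeds the threshold: attach a footnote link to the sentence.
            footnotes.append(best.url)
            modified.append(f"{sentence} [{len(footnotes)}]")
        else:
            # Score does not exceed the threshold: skip modification.
            modified.append(sentence)
    return " ".join(modified), footnotes


if __name__ == "__main__":
    doc = "Water boils at 100 degrees Celsius at sea level. I like tea."
    candidates = [Passage("At sea level, water boils at 100 degrees Celsius.",
                          "https://example.org/boiling")]
    text, notes = add_footnotes(doc, candidates)
    print(text)
    print(notes)
```

In this sketch only the first sentence receives a footnote marker, because the second sentence shares no words with the candidate passage. An authoritativeness rating such as that of claim 3 could be folded in by weighting each score before taking the maximum.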
