1. Field of Art
This invention pertains, in general, to scoring similar passages in digital text documents and, in particular, to ranking similar passages based on characteristics of the similar passages occurring in the digital text documents.
2. Description of the Related Art
Advancement in digital technology has changed the way people acquire information. For example, people can now view electronic documents that are stored in a predominantly text corpus such as a digital library that is accessible via the Internet. Such a digital text corpus is established, for example, by scanning paper copies of documents including books and newspapers, and then applying an optical character recognition (OCR) process to produce computer-readable text from the scans. The corpus can also be established by receiving documents and other texts already in machine-readable form.
Many of these electronic documents contain similar passages or quotations that appear multiple times within the corpus. Users may search for documents in the digital corpus based on various search queries. Additionally, users may search for the documents based on known or popular quotations or phrases contained in the documents. However, these types of searches may yield thousands of matching results, and the most relevant results may not initially be displayed, making it difficult for users to locate the documents or passages most relevant to their queries.
The problems described above are addressed by a computer-implemented method, computer program product, and computer system for calculating a score for a passage having a plurality of instances occurring in a digital corpus. Embodiments of the method comprise calculating at least one score based at least in part on characteristics of instances of the passage occurring in the digital corpus and generating a ranking score associated with the passage based at least in part on the calculated at least one score. The method further comprises storing the ranking score in association with the passage in a computer-readable medium. Embodiments of the computer program product and computer system comprise computer code for performing similar functions.
The features and advantages described in the specification are not all inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the disclosed subject matter.
The disclosed embodiments have other advantages and features which will be more readily apparent from the detailed description, the appended claims, and the accompanying figures (or drawings). A brief introduction of the figures is below.
The figures depict various embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
The Figures (FIGS.) and the following description describe embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.
Not all the entities shown in
The data store 110 stores the corpus 112 of information and the similar passage database 114. It also stores data utilized to support the functionalities or generated by the functionalities described herein. The data store 110 can also store other corpora and data. The data store 110 receives requests for information stored in it and provides the information in return. In a typical embodiment, the data store 110 is comprised of multiple computers and/or storage devices configured to collectively store a large amount of information.
The corpus 112 stores a set of information. In one embodiment, the corpus 112 stores the contents of a large number of digital documents. As used herein, the term “document” refers to a written work or composition. This definition includes, for example, conventional books such as published novels, and collections of text such as newspapers, news stories, magazines, journals, pamphlets, letters, articles, web pages and other electronic documents. The document contents stored by the corpus 112 include, for example, the document text represented in a computer-readable format, images from the documents, scanned images of pages from the documents, etc. As used herein, the term “word” refers to a token containing a block of structured text. The word does not necessarily have meaning in any language, although it will have meaning in most cases.
In addition, the corpus 112 stores metadata about the documents within it. The metadata are structured data that describe the documents. Examples of metadata include metadata about a book such as the author, publisher, year published, number of pages, edition, and libraries that carry the book. The metadata stored in the corpus is associated with the similar passages stored in the similar passage database 114.
The similar passage database 114 stores data describing similar passages in the corpus 112. The similar passage database 114 also stores the ranking score of the similar passage once a ranking score is assigned by the scoring engine 128. More details describing the function of the scoring engine 128 are provided below.
As used herein, the phrase “similar passage” refers to a passage in a source document that is found in a similar form in one or more different target documents. Occurrences of the same similar passage are referred to as “instances” of that passage. Oftentimes, the similar passage instances are identical. Nevertheless, the passages are referred to as “similar” because there might be slight differences among the passage instances in the different documents. When a source document is said to have multiple “similar passages,” it means that multiple passages in the source document are also found in other documents. This phrase does not necessarily mean that the “similar passages” within the source document are similar to each other. Similar passages are also referred to as “quotations,” “shared passages,” “popular passages,” and “related passages.”
In one embodiment, the passage database 114 is generated by the passage mining engine 116 to store information obtained from passage mining. In some embodiments, the passage mining engine 116 constructs the passage database 114 by copying existing quotation collections such as Bartlett's, and searching and indexing the instances of quotations and their variations that appear in the corpus 112. In some embodiments, the passage mining engine 116 constructs the passage database 114 by copying existing text appearing in a quoted form, such as delimited by quotation marks, from the corpus, and searching and indexing the instances of the text in the corpus 112. Further, in some embodiments the passage mining engine 116 constructs the passage database 114 by copying each group of words, such as sentences, from the corpus, and searching and indexing the instances of the group of words in the corpus 112. In one embodiment, the database 114 stores similar passages, document identifiers (Doc IDs) identifying the documents in which the passages exist, position identifiers (Pos IDs) identifying the location in the documents at which the passages appear, passage ranking results, etc. Further, in some embodiments, the database 114 also stores the documents or portions of the documents that have the similar passages.
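By way of illustration only, one possible in-memory layout for an entry of the similar passage database 114 is sketched below in Python. The class and field names (`PassageInstance`, `SimilarPassageRecord`, `doc_id`, `pos_id`, `ranking_score`) and the sample values are hypothetical and chosen for this example; the embodiments above do not prescribe any particular schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PassageInstance:
    """One occurrence of a similar passage (hypothetical layout)."""
    doc_id: str  # Doc ID of the document containing the instance
    pos_id: int  # Pos ID: location of the instance within that document


@dataclass
class SimilarPassageRecord:
    """A similar passage together with its instances and ranking result."""
    text: str
    instances: List[PassageInstance] = field(default_factory=list)
    ranking_score: Optional[float] = None  # unset until a score is assigned


# Made-up identifiers, used only to show how a record might be populated.
record = SimilarPassageRecord(text="an example passage")
record.instances.append(PassageInstance(doc_id="doc-001", pos_id=1520))
record.instances.append(PassageInstance(doc_id="doc-002", pos_id=87))
```

Keeping the ranking score optional reflects that passage mining and scoring may run at different times.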
The passage mining engine 116 includes one or more computers adapted to analyze the texts of documents in the corpus 112 in order to identify similar passages. For example, the passage mining engine 116 may find that the passage “I read somewhere that everybody on this planet is separated by only six other people” from the book “Six Degrees of Separation” by John Guare, also appears in 13 other books published between 2000 and 2006. The passage mining engine 116 may store, in the similar passage database 114, the passage, its location in the “Six Degrees of Separation” book, Doc IDs of the 13 other books, Pos IDs indicating the locations of the passage instances in the 13 other books, and its ranking relative to other similar passages in the “Six Degrees of Separation” book or relative to other similar passages in the corpus 112. More detail regarding the passage mining engine 116 is described in the related application, U.S. patent application Ser. No. 11/781,213, filed Jul. 20, 2007, and titled “Identifying and Linking Similar Passages in a Digital Text Corpus.” Passage mining may be performed off-line, asynchronously of any queries made by the client 118 against the data store 110. In one embodiment, the passage mining engine 116 runs periodically to process all the text information in the corpus 112 from scratch and generate similar passage data for storing in the similar passage database 114, disregarding any information obtained from prior passage mining. In another embodiment, the passage mining engine 116 is used periodically to incrementally update the data stored in the similar passage database 114, for example, as new documents are added to the corpus 112.
The scoring engine 128 includes one or more computers adapted to assign scores to the similar passages identified by the passage mining engine 116 and stored in the similar passage database 114. In one embodiment, the scoring engine 128 analyzes the characteristics of the similar passages and the documents containing the similar passages stored in the similar passage database 114 and assigns ranking scores to the similar passages. Scoring may be performed on-line when the scoring engine 128 is connected to the network 122 and may also be performed off-line, asynchronously of any queries made by the client 118 against the data store 110. In one embodiment, the scoring engine 128 runs periodically to process all of the content from the data store 110 from scratch and assigns a score associated with a similar passage for storing in the similar passage database 114. In another embodiment, the scoring engine 128 is used periodically to incrementally update the ranking information stored in the similar passage database 114, for example, as new similar passages are found and added to the similar passage database 114.
The ranking engine 130 ranks a set of similar passages to be displayed on the client 118. The ranking engine 130 ranks the set of similar passages based on the associated ranking scores of the similar passages. The set of similar passages can be displayed on the client 118 in the ranked order.
For purposes of illustration,
In one embodiment, the client 118 is an electronic device having a web browser for interacting with the web server 120 via the network 122, and it is used by a human user to access and obtain information from the data store 110. It can be, for example, a notebook, desktop, or handheld computer, a mobile telephone, personal digital assistant (PDA), mobile email device, portable game player, portable music player, computer integrated into a vehicle, etc.
The web server 120 interacts with the client 118 and the ranking engine 130 to provide information from the data store 110. In one embodiment, the web server 120 includes a User Interface (UI) module 124 that communicates with the client's 118 web browser to receive and present information. The web server 120 also includes a searching module 126 that searches for information in the data store 110. For example, the UI module 124 may receive a query from the web browser issued by a user of the client 118, and the searching module 126 may execute the query against the corpus 112 and the similar passage database 114, and retrieve information including similar passages information that satisfies the query. The similar passages are displayed and listed in accordance with a ranking order provided by the ranking engine 130.
The network 122 represents communication pathways between the data store 110, passage mining engine 116, client 118, web server 120, the scoring engine 128, and the ranking engine 130. In one embodiment, the network 122 is the Internet. The network 122 can also utilize dedicated or private communications links that are not necessarily part of the Internet. In one embodiment, the network 122 uses standard communications technologies, protocols, and/or interprocess communications techniques. Thus, the network 122 can include links using technologies such as Ethernet, 802.11, integrated services digital network (ISDN), digital subscriber line (DSL), asynchronous transfer mode (ATM), etc. Similarly, the networking protocols used on the network 122 can include the transmission control protocol/Internet protocol (TCP/IP), the hypertext transport protocol (HTTP), the simple mail transfer protocol (SMTP), the file transfer protocol (FTP), the short message service (SMS) protocol, etc. The data exchanged over the network 122 can be represented using technologies and/or formats including the hypertext markup language (HTML), the extensible markup language (XML), etc. In addition, all or some of the links can be encrypted using conventional encryption technologies such as the secure sockets layer (SSL), HTTP over SSL (HTTPS), and/or virtual private networks (VPNs). In another embodiment, the nodes can use custom and/or dedicated data communications technologies instead of, or in addition to, the ones described above.
The processor 202 may be any general-purpose processor such as an INTEL x86 compatible-CPU. The storage device 208 is any device capable of holding data, like a hard drive, compact disk read-only memory (CD-ROM), DVD, or a solid-state memory device. The memory 206 holds instructions and data used by the processor 202 and may be, for example, firmware, read-only memory (ROM), non-volatile random access memory (NVRAM), and/or RAM. The pointing device 214 may be a mouse, track ball, or other type of pointing device, and is used in combination with the keyboard 210 to input data into the computer system 200. The graphics adapter 212 displays images and other information on the display 218. The network adapter 216 couples the computer system 200 to the network 122.
As is known in the art, the computer 200 is adapted to execute computer program modules. As used herein, the term “module” refers to computer program logic and/or data for providing the specified functionality. A module can be implemented in hardware, firmware, and/or software. In one embodiment, the modules are stored on the storage device 208, loaded into the memory 206, and executed by the processor 202 as one or more processes.
The types of computers used by the entities of
Embodiments of the entities described herein can include other and/or different modules than the ones described here. In addition, the functionality attributed to the modules can be performed by other or different modules in other embodiments. Moreover, this description occasionally omits the term “module” for purposes of clarity and convenience.
The characteristics analysis module 302 analyzes characteristics associated with a similar passage and its similar passage instances in order to produce a total score. Characteristics that are analyzed include characteristics associated with the passage or passage instance itself and characteristics associated with the usage of the similar passage in the digital corpus 112. Examples of such characteristics are the number of words in the passage, the author of the document which contains the similar passage instance, the publisher of the document which contains the similar passage instance, the characteristics of the words introducing and following the similar passage, how frequently the similar passage appears in the digital corpus, the length of the similar passage, the words of the similar passage, the usage of punctuation associated with the similar passage, and the diffusion of the similar passage in the digital corpus. The diffusion of the similar passage is determined by analyzing the variation of the authors of the documents in which the instances of the passage appear, the variation of the publishers of the documents in which the similar passage instances appear, the variation of the libraries that carry the documents in which the similar passage instances appear, and/or the variation of the parts of the documents in which the similar passage instances appear.
In one embodiment, the author associated with the document which contains a similar passage instance is identified and examined by the characteristics analysis module 302. In some embodiments, the characteristics analysis module 302 compares the identified author to a list or database of previously-identified famous or known authors. In one embodiment, each author in the list or database has an associated score. In such embodiments, when the characteristics analysis module 302 compares the identified authors to the list or database, and the identified author is found therein, the module 302 assigns the score associated with that author to the similar passage instance. If the identified author is not found, the module 302 assigns a low score or a score of zero to the similar passage instance. In some embodiments, the authors in the list or database do not have an associated score. In those embodiments, the module 302 assigns a score to the similar passage instance based on whether the identified author was found in the database. The assigned score is represented by A(Q).
In some embodiments, the list or database of previously-identified famous or known authors may be based on authors found in a printed encyclopedia, an online encyclopedia, such as Wikipedia, or other sources such as Bartlett's.
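As a non-limiting sketch, the author lookup described above may be expressed in Python as follows. The function name `author_score`, the dictionary representation of the list of known authors, the sample entries, and the default values `found_score` and `missing_score` are assumptions made for this example. A value of `None` stands for a listed author that has no associated score.

```python
from typing import Dict, Optional


def author_score(
    author: str,
    known_authors: Dict[str, Optional[float]],
    found_score: float = 1.0,    # assumed score when the list carries no per-author score
    missing_score: float = 0.0,  # assumed "low score or a score of zero"
) -> float:
    """Return A(Q) for one passage instance based on its document's author."""
    if author not in known_authors:
        return missing_score
    listed = known_authors[author]
    # Use the per-author score when one exists; otherwise score mere presence.
    return found_score if listed is None else listed


# Hypothetical list of known authors; the scores are illustrative only.
known = {"William Shakespeare": 0.9, "Jane Austen": None}
```

Here `author_score("William Shakespeare", known)` yields the listed score, `author_score("Jane Austen", known)` falls back to `found_score`, and an unlisted author receives `missing_score`.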
In one embodiment, frequency of appearance of the similar passage, or the number of similar passage instances in the digital corpus 112, is a characteristic that is examined. The characteristics analysis module 302 examines and identifies the frequency of appearance of the similar passage in the digital corpus 112. If the similar passage appears in fewer documents, the characteristics analysis module 302 assigns a lower score to that similar passage. If the similar passage appears in many documents, the characteristics analysis module 302 assigns a higher score to that similar passage.
In some embodiments, there are certain similar passages that tend to appear very frequently and the characteristics analysis module 302 adjusts the score downward as a result. For example, a cliché or overused slogan may be identified as a similar passage and may be very prevalent throughout the digital corpus 112. In those instances, the cliché or slogan may be assigned a lower score because the high frequency of occurrence does not necessarily indicate that the passage has great significance.
In some embodiments, the length of the similar passage may be a factor in determining a score based on the frequency of appearance of the similar passage. For example, a very short similar passage (for example, one that includes fewer than five or six words) may appear frequently. However, since this passage is shorter than the average length of a passage, it is assigned a lower score. Conversely, if the similar passage is long (for example, more than ten words in length), it would still be assigned a high score if the frequency of appearance of the similar passage within the digital corpus 112 is high. In one embodiment, the score associated with the frequency of appearance of the similar passage in the digital corpus 112 is represented by F(Q).
In one embodiment, the length of the similar passage is a characteristic that is separately examined and scored by the characteristics analysis module 302. The characteristics analysis module 302 assigns a lower score to a very short passage (for example, one that includes fewer than five or six words) and assigns a higher score to a long passage (for example, more than ten words in length). In one embodiment, the score associated with the length of the similar passage in the digital corpus 112 is represented by L(Q).
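One possible way to turn the length thresholds mentioned above into a score L(Q) is sketched below in Python. The function name `length_score`, the whitespace tokenization, the 0-to-1 score range, and the linear ramp between the short and long limits are all assumptions made for this example; the description above states only that very short passages score lower and long passages score higher.

```python
def length_score(passage: str, short_limit: int = 5, long_limit: int = 10) -> float:
    """Return an illustrative L(Q) in [0, 1] from the passage word count.

    Passages at or below `short_limit` words score 0.0, passages at or
    above `long_limit` words score 1.0, and lengths in between are
    interpolated linearly (an assumption of this sketch).
    """
    word_count = len(passage.split())
    ramp = (word_count - short_limit) / (long_limit - short_limit)
    return min(1.0, max(0.0, ramp))
```

The limits are exposed as parameters because the appropriate values likely depend on the corpus.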
In one embodiment, the variation of words and grammar of the similar passage are characteristics that are examined. The characteristics analysis module 302 examines the words of the similar passage and assigns a score to the similar passage in response. The characteristics analysis module 302 assigns a lower score to a similar passage that contains repeating words or numbers and assigns a higher score to a passage that contains few repeating words or numbers. In some embodiments, if the similar passage is a chart, or another table-like presentation of words (i.e. words with no verbs), then the characteristics analysis module 302 assigns a lower score to that similar passage.
In some embodiments, the characteristics analysis module 302 applies one or more language models to analyze the words of the similar passage. For example, language models may be used to determine whether the words of the similar passage demonstrate usage of proper grammar or whether the words contain too many numbers. In such embodiments, a high score is assigned to a passage that demonstrates use of proper grammar and a low score is assigned to a passage that demonstrates use of improper grammar. Additionally, the score of a passage that contains too many numbers is lowered. In one embodiment, the score associated with the word analysis of the similar passage in the digital corpus is represented by W(Q).
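For illustration, a very simple stand-in for the word analysis score W(Q) is sketched below in Python. It covers only the repeated-word and number heuristics described above and does not apply a language model or check grammar. The function name `word_variation_score`, the whitespace tokenization, and the way the two ratios are multiplied together are assumptions made for this example.

```python
def word_variation_score(passage: str) -> float:
    """Return an illustrative W(Q) in [0, 1].

    The score falls as words repeat and as the share of purely numeric
    tokens grows. This is a heuristic sketch, not a language model.
    """
    tokens = passage.lower().split()
    if not tokens:
        return 0.0
    # Fraction of tokens that are distinct: 1.0 means no repetition.
    distinct_ratio = len(set(tokens)) / len(tokens)
    # Fraction of tokens that are numbers once edge punctuation is stripped.
    numeric_ratio = sum(t.strip(".,;:()").isdigit() for t in tokens) / len(tokens)
    return distinct_ratio * (1.0 - numeric_ratio)
```

Under these assumptions a table-like run of numbers scores near zero, while ordinary prose with little repetition scores near one.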
In one embodiment, the usage of punctuation associated with the similar passage is identified and examined by the characteristics analysis module 302. For example, the use of quotation marks surrounding a similar passage is an indication that the similar passage is a quotation and therefore the passage is assigned a higher score. In one embodiment, the score associated with the use of punctuation marks is represented by P(Q).
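The quotation mark check described above may, for example, be sketched in Python as follows. The function name `punctuation_score`, the character offsets `start` and `end`, the sets of recognized quotation marks, and the two score values are assumptions made for this example.

```python
OPENING_QUOTES = {'"', "'", "\u201c", "\u2018"}  # straight and curly opening marks
CLOSING_QUOTES = {'"', "'", "\u201d", "\u2019"}  # straight and curly closing marks


def punctuation_score(
    text: str,
    start: int,
    end: int,
    quoted_score: float = 1.0,    # assumed score for a quoted instance
    unquoted_score: float = 0.0,  # assumed score otherwise
) -> float:
    """Return an illustrative P(Q) for the instance text[start:end]."""
    before = text[:start].rstrip()
    after = text[end:].lstrip()
    opened = bool(before) and before[-1] in OPENING_QUOTES
    closed = bool(after) and after[0] in CLOSING_QUOTES
    return quoted_score if opened and closed else unquoted_score


sample = 'He said, "all the world is a stage" and left.'
passage = "all the world is a stage"
begin = sample.index(passage)
quoted = punctuation_score(sample, begin, begin + len(passage))
```

In this sketch both an opening and a closing mark must be present; an embodiment could equally treat either mark alone as sufficient.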
In one embodiment, the document that contains a similar passage instance is a characteristic that is identified and examined by the characteristics analysis module 302. Similar to the analysis of the author of the document, the characteristics analysis module 302 compares the identified document to a list or database of previously-identified famous or known documents. In one embodiment, each document in the list or database has an associated score. In such embodiments, when the characteristics analysis module 302 compares the identified document to the list or database of documents, and the identified document is found therein, the module 302 assigns the score associated with that document to the similar passage instance. If the identified document is not found in the database, the module 302 assigns a low score or a score of zero. In some embodiments, the documents in the list or database do not have associated scores. In those embodiments, the module 302 assigns a score to the similar passage instance based on whether the identified document was found therein. In one embodiment, the assigned score is represented by B(Q).
In one embodiment, the set of words introducing a similar passage and the set of words following a similar passage is a characteristic that is examined. In some embodiments, these words are known as speech acts. For example, words such as “Person X says” or “Person X wrote” are indications that a similar passage is to follow. As another example, speech acts, such as “said Person X” are indications that a similar passage appeared before the exemplary speech act phrase. A higher score is assigned to a similar passage that is introduced by or followed by a speech act. In one embodiment, the assigned score is represented by S(Q).
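As a minimal sketch, speech acts of the kind described above may be detected in Python with regular expressions. The short verb list, the pattern names `INTRO` and `FOLLOW`, the function name `speech_act_score`, and the score values are assumptions made for this example; a practical embodiment would likely use a larger vocabulary.

```python
import re

# Assumed, deliberately small list of speech-act verbs.
SPEECH_VERBS = r"(?:says|said|writes|wrote)"

# A verb at the end of the preceding text, e.g. 'Person X wrote, "'.
INTRO = re.compile(r"\b" + SPEECH_VERBS + r"[\s,:\"'\u201c\u2018]*$", re.IGNORECASE)

# A verb at the start of the following text, e.g. ',” said Person X'.
FOLLOW = re.compile(r"^[\s,.\"'\u201d\u2019]*" + SPEECH_VERBS + r"\b", re.IGNORECASE)


def speech_act_score(before: str, after: str, hit: float = 1.0, miss: float = 0.0) -> float:
    """Return an illustrative S(Q) from the text around a passage instance.

    `before` is the text immediately preceding the instance and `after`
    is the text immediately following it.
    """
    if INTRO.search(before) or FOLLOW.search(after):
        return hit
    return miss
```

For example, `speech_act_score('Person X wrote, "', '" and then left')` matches the introducing pattern, and `speech_act_score("In the end, ", ',\u201d said Person X.')` matches the following pattern.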
In one embodiment, a diffusion of the similar passage in the digital corpus 112 is examined by the characteristics analysis module 302. In one embodiment, the assigned score is represented by D(Q) and is calculated by first calculating entropy scores as explained below.
In one embodiment, the variation of the authors, or number of different authors, of the documents containing a particular similar passage is a component of the diffusion score. The characteristics analysis module 302 examines the authors of the documents containing the instances of a particular similar passage in order to determine the number of different authors. The characteristics analysis module 302 assigns a higher score to a similar passage that is associated with many different authors, and assigns a lower score to a similar passage that is associated with fewer different authors. In one embodiment, the score is calculated using the following entropy equation:

$$E(A) = -\sum_{x} p(x)\,\log_2 p(x)$$
As shown in the exemplary equation above, the entropy of the authors (E(A)), is calculated by taking the negative summation of the product of p(x) and the log of p(x), where p(x) is the probability that author x will occur in a given set of examined documents and is expressed as a fraction. For example, when calculating E(A), the individual probabilities correspond to the probability that a particular author will appear as an author of a document among the set of examined documents containing a particular similar passage. Using the equation above, if ten documents containing instances of a particular similar passage were examined and all ten documents were associated with the same author, p(x) would be one, and the entropy of the author (E(A)) would be zero. However, if some of the documents were associated with different authors, the entropy of the author (E(A)) would be greater than zero. If a large number of documents were examined and all the documents were associated with different authors, the value of the entropy of the authors would be high. For example, if ten documents were examined and ten authors were identified (each document corresponding to a different author), p(x)*log2(p(x)) for each author is −0.3322 and the negative summation is 3.322.
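The entropy calculation and the ten-document examples described above can be reproduced with the short Python sketch below. The function name `entropy` and the use of observed frequencies as the probabilities p(x) are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def entropy(labels: Iterable[Hashable]) -> float:
    """Return the base-2 entropy of the labels, e.g. one author per document.

    p(x) is estimated as the fraction of examined documents carrying
    label x (an assumption of this sketch).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Ten documents by the same author: no variation.
same_author = entropy(["Author 1"] * 10)

# Ten documents, each by a different author: about 3.322.
ten_authors = entropy([f"Author {i}" for i in range(10)])
```

The same function can be applied to publishers, libraries, or document parts by passing the corresponding labels.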
In one embodiment, the variation of the publishers of the documents associated with the particular similar passage is a component of the diffusion score. The publishers of the documents containing instances of the particular similar passage are examined and identified. Similar to the calculation for authors, the characteristics analysis module 302 calculates an entropy of the publishers (E(P)) by using a formula similar to the one above, but in this case p(x) corresponds to the probability of the occurrence of a particular publisher. Therefore, similar to the analysis of the authors, the characteristics analysis module 302 assigns a higher score to a similar passage that is associated with many different publishers, and assigns a lower score to a similar passage that is associated with fewer different publishers.
In one embodiment, the variation of the libraries that carry copies of the documents containing instances of the particular passage is a component of the diffusion score that is identified by the characteristics analysis module 302. Similar to the calculation for authors and publishers, the characteristics analysis module 302 calculates an entropy of the libraries (E(L)). In this case, p(x) corresponds to the probability of the appearance of a particular library that carries a copy of a document containing a particular similar passage. Therefore, similar to the analysis of the authors and publishers, the characteristics analysis module 302 assigns a higher score to a similar passage that appears in a document that is held in a collection of many different libraries, and assigns a lower score to a similar passage that appears in a document that is held in a collection of fewer different libraries.
In one embodiment, the variation of the parts of documents in which the similar passage instances appear is a component of the diffusion score. The characteristics analysis module 302 examines and identifies parts of the documents in which the similar passage appears. In some embodiments, a document is divided into a number of parts. For example, a document may be divided into three parts: a first third (the beginning part of the document), a second third (the middle part of the document), and a last third (the end part of the document). Among the documents containing the similar passage instances, the characteristics analysis module 302 determines the parts of the documents in which the similar passage instances appear. Similar to the calculations above, the characteristics analysis module 302 calculates an entropy of the parts of the documents (E(Q)) using a similar formula. In this case, p(x) corresponds to the probability of the appearance of a passage instance in a particular part of a document. Therefore, the characteristics analysis module 302 assigns a higher score to a similar passage that appears in different parts of documents, and assigns a lower score to a similar passage that appears in the same part, or mostly the same part, of the documents.
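By way of example, the division of a document into thirds and the resulting entropy of the parts may be sketched in Python as follows. The function names `part_of_document` and `parts_entropy`, the representation of each instance as a `(position, document_length)` pair, and the use of 0-based positions are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def part_of_document(position: int, doc_length: int, parts: int = 3) -> int:
    """Return the 0-based part (0 = beginning) containing a 0-based position."""
    if not 0 <= position < doc_length:
        raise ValueError("position must lie inside the document")
    return position * parts // doc_length


def parts_entropy(instances: Iterable[Tuple[int, int]], parts: int = 3) -> float:
    """Return the base-2 entropy of the parts in which the instances appear.

    Each instance is a (position, doc_length) pair (hypothetical layout).
    """
    counts = Counter(part_of_document(p, n, parts) for p, n in instances)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Instances that all fall in the first third of their documents give an entropy of zero, while instances spread evenly over the three parts give log2(3), roughly 1.585.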
The characteristics analysis module 302 combines the entropies calculated above (E(A), E(P), E(L), and E(Q)) in order to calculate a total diffusion (D(Q)) of the similar passage throughout the corpus. Depending upon the embodiment, the characteristics analysis module 302 calculates D(Q) as a sum of its components, as a weighted linear combination, as a weighted geometric mean or using another technique. The characteristics analysis module 302 assigns the total diffusion score D(Q) to the similar passage. In some embodiments, the total diffusion score is stored in association with the similar passage in the similar passage database 114.
An embodiment of the score calculation module 306 combines the individual scores described above (A(Q), F(Q), L(Q), W(Q), P(Q), B(Q), S(Q), and D(Q)) to determine the total score assigned to a similar passage. In one embodiment, the total score is calculated by summing the individual scores. In some embodiments, certain individual characteristics are more important or more relevant than others. Therefore, the characteristics analysis module 302 weights scores for certain characteristics more than scores for other characteristics. In some embodiments, the total score is determined by a weighted linear combination of the individual scores. In other words, each individual score is assigned a weight and is multiplied by its assigned weight to yield a weighted score. The weighted scores are summed in order to yield the total score. In other embodiments, the total is determined by a weighted geometric mean. In other words, each score is assigned a weight. Each score is then raised to the power of the weight to yield a weighted score. The weighted scores are then multiplied together to yield the total score. In some embodiments, the sum of the weights equals one. Therefore, if one weight is increased by a certain amount the total of the other weights is decreased by the same amount such that the sum of the weights remains one.
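The two weighting techniques described above may be sketched in Python as follows. The function names `normalize`, `weighted_linear`, and `weighted_geometric`, the dictionary keyed by score name, and the sample values are assumptions made for this example.

```python
import math
from typing import Dict


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Rescale the weights so that they sum to one."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def weighted_linear(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Multiply each score by its weight and sum the weighted scores."""
    return sum(weights[name] * scores[name] for name in scores)


def weighted_geometric(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Raise each score to the power of its weight and multiply the results."""
    return math.prod(scores[name] ** weights[name] for name in scores)


# Illustrative values only: two of the individual scores and their weights.
scores = {"A": 0.5, "F": 1.0}
weights = normalize({"A": 1.0, "F": 3.0})  # A: 0.25, F: 0.75

linear_total = weighted_linear(scores, weights)        # 0.875
geometric_total = weighted_geometric(scores, weights)  # about 0.841
```

One difference worth noting is that a single individual score of zero forces the geometric total to zero (for a positive weight), whereas the linear total only loses that term.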
The total score serves as the ranking score for the passage. In some embodiments, the score calculation module 306 aggregates a subset of the scores described above to produce the ranking score for a similar passage. Information about the similar passage and its associated ranking score are stored in the similar passage database 114.
The scoring engine 128 receives 402 a set of similar passage instances for a passage in the digital corpus 112 to be analyzed. The scoring engine 128 calculates 404 the individual scores (A(Q), F(Q), L(Q), W(Q), P(Q), B(Q), S(Q), and D(Q)) for the examined characteristics. The scoring engine 128 then determines 406 a ranking score for the identified passage. In one embodiment, the individual scores are summed in order to produce a total score that serves as the ranking score for the identified passage. The scores can also be combined using one or more of the weighting techniques described above. The ranking score is associated with the passage and stored 408 in the similar passage database 114. This process can be performed for each similar passage in the similar passage database 114.
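The flow of steps 402 through 408 can be sketched in Python as follows, assuming for the example that each individual score is supplied as a callable and that the similar passage database 114 is modeled as a plain dictionary. The names `Scorer`, `score_passage`, `passage_id`, and `database` are hypothetical, and the summation shown is only one of the combinations described above.

```python
from typing import Callable, Dict, List

# A scorer maps the instances of one passage to an individual score.
Scorer = Callable[[List[dict]], float]


def score_passage(
    passage_id: str,
    instances: List[dict],       # step 402: the received passage instances
    scorers: Dict[str, Scorer],
    database: Dict[str, float],  # stand-in for the similar passage database
) -> float:
    """Compute and store an illustrative ranking score for one passage."""
    # Step 404: calculate the individual scores.
    individual = {name: scorer(instances) for name, scorer in scorers.items()}
    # Step 406: combine them, here by simple summation.
    ranking_score = sum(individual.values())
    # Step 408: store the ranking score in association with the passage.
    database[passage_id] = ranking_score
    return ranking_score
```

Calling `score_passage` once per passage corresponds to performing the process for each similar passage in the database.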
A client device 118 sends 502 a request to the web server 120. The request from the client device 118 may be a search query entered by a user. In some embodiments, the request from the client device 118 may be created when the user selects a hypertext link presented on the client device. The web server 120 receives 504 the request and determines 506 a set of results from the similar passage database 114. The set of results is a set of similar passages. The ranking engine 130 ranks 508 the similar passages based on the ranking scores associated with the similar passages, thereby determining the order in which to display the similar passages. The search results are received 510 by the client device 118 and displayed 512 in the ranked order.
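As a minimal sketch of ranking step 508, a set of results can be ordered by ranking score in Python as shown below. The function name `rank_passages`, the `(passage, ranking_score)` tuple layout, and the choice of descending order (highest score first) are assumptions made for this example.

```python
from typing import List, Tuple


def rank_passages(results: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return the results ordered from highest to lowest ranking score.

    Python's sort is stable, so passages with equal scores keep the
    order in which they were retrieved.
    """
    return sorted(results, key=lambda item: item[1], reverse=True)


# Illustrative results only.
ranked = rank_passages([("passage a", 0.2), ("passage b", 0.9), ("passage c", 0.5)])
```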
In
As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
In addition, the articles “a” or “an” are employed to describe elements and components of the embodiments herein. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one, and the singular also includes the plural unless it is obvious that it is meant otherwise.
Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs for a system and a process for ranking similar passages through the disclosed principles herein. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.
This application claims the benefit of U.S. Provisional Patent Application No. 60/956,880, filed Aug. 20, 2007, the contents of which are hereby incorporated by reference. This application is related to U.S. patent application Ser. No. 11/781,213, filed Jul. 20, 2007, and titled “Identifying and Linking Similar Passages in a Digital Text Corpus,” the contents of which are hereby incorporated by reference.
Number | Date | Country
---|---|---
60956880 | Aug 2007 | US