This invention relates to word processing, text processing, and information search.
The term “text morphing,” as used herein, is the integration or blending together of substantive content from two or more bodies of text into a single body of text based on locations of linguistic commonality among the two or more bodies of text. In some respects, this “text morphing” may be viewed as the text-based version of “image morphing” in which two or more images are integrated or blended together based on locations of image subject commonality among the two or more images. The meaning of the term “text morphing” as used herein is different from its occasional use in the prior art in reference to incremental video-graphic transition of text letters from one word (or phrase) to another word (or phrase).
The method of text morphing that is disclosed has several useful applications. For example, text morphing can synthesize novel concepts and expressions that, when combined with human imagination, can create useful ideas, creative works, and products. Sometimes a stroke of genius comes from combining diverse concepts in a way that no one has done before and sometimes these combinations are serendipitous. An author or inventor who is uninspired when staring at a blank page or computer screen (as in “writer's block”) may be inspired to creative achievement by reading a text-morphed composition. As another application, text morphing may prove useful in the next generation of search methods. A search method that integrates and synthesizes information across multiple sources can provide more useful search results than a search method that is just limited to separate ranking and listing of individual sources. Also, as is the case with image morphing, text morphing may serve the purpose of entertaining and amusing people.
There are many interesting methods in the prior art for processing text from single and multiple text-based sources. However, none of these methods discloses morphing together substantive content from two or more text-based sources as is done by the invention that is disclosed herein. As an organizing construct for this review, text-processing methods may be classified into four general categories: (1) methods to create a summary of a single source; (2) methods to modify a single document by phrase substitution; (3) methods to combine content from multiple sources using templates; and (4) methods to combine content from multiple sources without templates. We now discuss these general method categories, including their limitations and some examples thereof.
1. Methods to Create a Summary of a Single Source
There are methods in the prior art to create a summary (or an abstract or targeted excerpt) of a single text-based source. These methods can also be applied to multiple text-based sources to create a separate summary for each of several sources. Such methods are useful for a variety of applications, including creating document summaries for research review purposes or for display of search engine results. However, such methods do not morph together content between two or more text-based sources. Examples in the prior art that appear to use such document-summarizing methods include the following U.S. Pat. Nos.: 6,865,572 (Boguraev et al., 2005; “Dynamically Delivering, Displaying Document Content as Encapsulated Within Plurality of Capsule Overviews with Topic Stamp”); 7,292,972 (Lin et al., 2007; “System and Method for Combining Text Summarization”); and 7,587,309 (Rohrs et al., 2009; “System and Method for Providing Text Summarization for Use in Web-Based Content”).
2. Methods to Modify a Single Document by Phrase Substitution
There are methods in the prior art to modify a single document by selectively substituting alternative phrases (single words or multiple word combinations) for the phrases that were originally used in the document. For example, the alternative phrases may be similar in meaning, but different in style or complexity, as compared to the original phrases used in the document. Such methods are useful for a variety of applications, including rewriting documents for different audiences or purposes. However, such methods do not morph together substantive content between two or more text-based sources.
Examples in the prior art that appear to use phrase substitution methods include the following U.S. Pat. Nos.: 4,456,973 (Carlgren et al., 1984; “Automatic Text Grade Level Analyzer for a Text Processing System”); 4,773,039 (Zamora, 1988; “Information Processing System for Compaction and Replacement of Phrases”); 7,113,943 (Bradford et al., 2006; “Method for Document Comparison and Selection”); 7,472,343 (Vasey, 2008; “Systems, Methods and Computer Programs for Analysis, Clarification, Reporting on and Generation of Master Documents for Use in Automated Document Generation”); 7,599,899 (Rehberg et al., 2009; “Report Construction Method Applying Writing Style and Prose Style to Information of User Interest”); 7,627,562 (Kacmarcik et al., 2009; “Obfuscating Document Stylometry”); and 7,640,158 (Detlef et al., 2009; “Automatic Detection and Application of Editing Patterns in Draft Documents”). Such examples also appear to include U.S. patent applications: 20070100823 (Inmon, 2007; “Techniques for Manipulating Unstructured Data Using Synonyms and Alternate Spellings Prior to Recasting as Structured Data”); 20090094137 (Toppenberg et al., 2009; “Web Page Optimization Systems”); 20090217159 (Dexter et al., 2009; “Systems and Methods of Performing a Text Replacement Within Multiple Documents”); and 20090313233 (Hanazawa, 2009; “Inspiration Support Apparatus Inspiration Support Method and Inspiration Support Program”).
3. Methods to Combine Content from Multiple Sources using Templates
There are methods in the prior art that use templates to combine content from multiple text-based sources into a single standard-format report or some other standardized document. For example, a standardized sales report may be created by extracting sales information from multiple sources to “fill in the blanks” of a template for a standardized sales report. There are many useful applications for such methods, but they are limited to the particular subject domains for which templates are created. They do not provide a generalizable, flexible method for morphing together content between two or more text-based sources across a wide variety of subject domains and applications. Examples in the prior art that appear to use templates to combine content from multiple text-based sources include: U.S. Pat. Nos. 7,627,809 (Balinsky, 2009; “Document Creation System and Related Methods”), 7,689,899 (Leymaster et al., 2010; “Methods and Systems for Generating Documents”), and 7,721,201 (Grigoriadis et al., 2010; “Automatic Authoring and Publishing System”); as well as U.S. patent application 20100070448 (Omoigui, 2010; “System and Method for Knowledge Retrieval, Management, Delivery and Presentation”).
4. Methods to Combine Content from Multiple Sources without Templates
There are methods in the prior art that combine, to some extent, content from multiple text-based sources without using a template. U.S. Pat. No. 5,953,718 (Wical, 1999; “Research Mode for a Knowledge Base Search and Retrieval System”) uses point of view “gists” from different documents to create a synopsis. U.S. Pat. No. 6,847,966 (Sommer et al., 2005; “Method and System for Optimally Searching a Document Database Using a Representative Semantic Space”) uses “pseudo-document vectors” to represent hypothetical documents. U.S. Pat. No. 7,366,711 (McKeown et al., 2008; “Multi-Document Summarization System and Method”) performs temporal processing on phrases from different documents in order to generate a summary. U.S. Pat. No. 7,548,913 (Ekberg et al., 2009; “Information Synthesis Engine”) organizes excerpts from, and hyperlinks to, different documents. U.S. Patent Application 20090193011 (Blair-Goldensohn et al., 2009; “Phrase Based Snippet Generation”) generates a snippet with a plurality of sentiments about an entity from different review sources. U.S. Patent Application 20090292719 (Lachtarnik et al., 2009; “Methods for Automatically Generating Natural-Language News Items from Log Files and Status Traces”) automatically generates natural-language news items from log files. These are interesting and useful methods. However, none of these methods flexibly morphs together the substantive content of two or more text-based sources as does the invention that we will now disclose herein.
This invention is a multi-stage method for “text morphing,” wherein text morphing involves integrating or blending together substantive content from two or more bodies of text into a single body of text based on locations of linguistic commonality among the two or more bodies of text. This method for multi-stage text morphing spans four stages of text morphing. First-stage text morphing is substitution of phrase synonyms between two bodies of text. This changes text style, but does not significantly change text meaning. Second-stage text morphing is substitution, between two bodies of text, of text segments with synonymous starting phrases and synonymous ending phrases.
This second stage is analogous, in some respects, to splicing different gene segments that have compatible starting and ending sequences, but different middle sequences. This begins to morph meaning in addition to style. The third and fourth stages of text morphing involve substitution, between two bodies of text, of phrases or segments using associations within a larger reference body of text. These latter stages substantially morph together the content of two or more bodies of text.
These figures show different examples of how this invention may be embodied.
However, these examples are not exhaustive and these figures do not limit the full generalizability of the claims.
An explanation of the arrows linking phrases is given in the symbol key at the bottom of the page with
In an example, one way to identify phrases that are synonyms is by using a database of phrase synonyms. There are several databases of synonyms in the prior art, including those integrated into common word processing software and publicly available datasets created by university researchers. There are also several methods in the prior art for creating databases of phrase synonyms. In light of this prior art, and since the particular method for selection or creation of a database of synonyms is not central to this invention, a particular database is not specified herein.
Phrase synonyms may be clustered into sets. A set of phrase synonyms may be bidirectionally substitutable—meaning that any phrase within the set can be substituted for any other phrase in the set, without creating significant changes in meaning or grammatical errors. Alternatively, a set of phrase synonyms may be only unidirectionally substitutable—meaning that there is at least one phrase in the set for which all other phrases in the set may be substituted, without creating significant changes in meaning or grammatical errors. For example, substitution of an acronym for a multi-word phrase is unidirectional if the acronym can stand for different multi-word combinations. Either bidirectional or unidirectional sets of phrase synonyms may be used in this method, as long as proper directionality of phrase substitution is maintained when unidirectional substitutions are done.
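The distinction between bidirectional and unidirectional sets can be sketched in code. The following is a minimal illustration only, not a structure specified by this disclosure: the class name `SynonymSet`, the field `replaceable`, and the method `can_substitute` are hypothetical, and the acronym example is assumed for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SynonymSet:
    """A set of phrase synonyms with optional directionality (hypothetical model)."""

    phrases: FrozenSet[str]
    # Phrases that other members of the set may replace.
    # None means the set is bidirectional: any member may replace any other.
    replaceable: Optional[FrozenSet[str]] = None

    def can_substitute(self, new: str, old: str) -> bool:
        """Return True if `new` may be substituted for (may replace) `old`."""
        if new == old or new not in self.phrases or old not in self.phrases:
            return False
        return self.replaceable is None or old in self.replaceable


# Bidirectional set: substitution is allowed in both directions.
size_words = SynonymSet(frozenset({"big", "large"}))

# Unidirectional set: the acronym may replace the full phrase, but not the
# reverse, because the acronym could stand for other word combinations.
atm = SynonymSet(
    phrases=frozenset({"ATM", "automated teller machine"}),
    replaceable=frozenset({"automated teller machine"}),
)
```

Checking `can_substitute` before each substitution is one way to maintain proper directionality when unidirectional sets are used.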
Stage-one morphs text style from Text A into Text B, but it does not significantly change the meaning of Text A. In this respect, stage-one text morphing is relatively non-intrusive. One could stop text morphing after stage-one without continuing to advanced stages. An advantage of stopping after stage-one morphing is that this largely preserves the logic, meaning, and grammar of Text A. However, stopping after stage-one does not significantly morph together the content of texts A and B. Thus, it does not generate novel concepts that can spark human imagination and invention. This limitation is why the method described herein is a multi-stage method that includes, at a minimum, a second stage of text morphing after this first stage.
In this embodiment of the invention, text content is morphed unidirectionally, from Text A into Text B. The labels “A” and “B” are arbitrary and can be reversed, so text content could be morphed unidirectionally from Text B into Text A, as long as one switches the labels “A” and “B” throughout the specification and claims. In a substantively different example, text can be morphed bidirectionally by switching phrases in both directions, not just one, as long as the synonym substitutions that are made are bidirectional synonym pairs. Text can be morphed from Text B into Text A at the same time that text is morphed from Text A into Text B. Such bidirectional text morphing creates two morphed texts, one that starts with Text A and morphs toward Text B, and one that starts with Text B and morphs toward Text A. In a third example of morphing directionality, the direction of substitution between A and B can be randomized across phrase pairs.
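The three directionality options can be illustrated with a short sketch of stage-one substitution. This is a simplified illustration under stated assumptions, not the claimed method: texts are assumed to be pre-segmented into lists of phrases, synonym sets are plain Python sets treated as bidirectional, only the first matching phrase in each text is paired, and the names `find_synonym_pairs` and `morph_stage_one` are hypothetical. "Morphing from Text A into Text B" is assumed here to mean that a phrase from Text A replaces its synonym in Text B.

```python
import random


def find_synonym_pairs(text_a, text_b, synonym_sets):
    """Pair the first phrase in Text A with the first phrase in Text B per synonym set."""
    pairs = []
    for syn in synonym_sets:
        in_a = next((p for p in text_a if p in syn), None)
        in_b = next((p for p in text_b if p in syn), None)
        if in_a is not None and in_b is not None and in_a != in_b:
            pairs.append((in_a, in_b))
    return pairs


def morph_stage_one(text_a, text_b, synonym_sets, direction="a_to_b", seed=0):
    """Return (morphed_a, morphed_b) after synonym substitution.

    direction: "a_to_b", "b_to_a", "both", or "random" (chosen per phrase pair).
    """
    rng = random.Random(seed)
    out_a, out_b = list(text_a), list(text_b)
    for phrase_a, phrase_b in find_synonym_pairs(text_a, text_b, synonym_sets):
        mode = direction
        if direction == "random":
            mode = rng.choice(["a_to_b", "b_to_a"])
        if mode in ("a_to_b", "both"):
            # Text A's phrase replaces its synonym in Text B.
            out_b = [phrase_a if p == phrase_b else p for p in out_b]
        if mode in ("b_to_a", "both"):
            # Text B's phrase replaces its synonym in Text A.
            out_a = [phrase_b if p == phrase_a else p for p in out_a]
    return out_a, out_b


text_a = ["the", "big", "dog", "ran", "quickly"]
text_b = ["a", "large", "cat", "moved", "rapidly"]
synonym_sets = [{"big", "large"}, {"quickly", "rapidly"}]

one_way = morph_stage_one(text_a, text_b, synonym_sets, direction="a_to_b")
two_way = morph_stage_one(text_a, text_b, synonym_sets, direction="both")
```

With `direction="both"` the function returns two morphed texts, one starting from each source, which mirrors the bidirectional example described above.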
It is important to note that in stage-two morphing, although the end portions of these pairs of text segments are synonyms, their middle portions are not. For example, phrase 303 in Text A is not a synonym of phrase 310 in Text B. Similarly, phrase 306 in Text A is not a synonym of phrase 313 in Text B. Thus, substituting such segments in stage two changes the meaning, not just the writing style, of a body of text.
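Splicing end-compatible segments can be sketched as follows. This is an illustrative sketch under several assumptions rather than the disclosed implementation: texts are pre-segmented phrase lists, identical phrases are treated as compatible end points, only the first matching pair of segments is spliced, and the function names and example sentences are hypothetical.

```python
def are_synonyms(p, q, synonym_sets):
    """True if two phrases are identical or share a synonym set."""
    return p == q or any(p in s and q in s for s in synonym_sets)


def find_segment_pair(text_a, text_b, synonym_sets):
    """Find the first pair of segments with synonymous start and end phrases.

    Returns ((i, j), (k, l)) as inclusive index spans in Text A and Text B,
    or None. Each segment must contain at least one middle phrase, and the
    two middles must differ (otherwise splicing would change nothing).
    """
    for i in range(len(text_a)):
        for k in range(len(text_b)):
            if not are_synonyms(text_a[i], text_b[k], synonym_sets):
                continue
            for j in range(i + 2, len(text_a)):
                for l in range(k + 2, len(text_b)):
                    if (are_synonyms(text_a[j], text_b[l], synonym_sets)
                            and text_a[i + 1:j] != text_b[k + 1:l]):
                        return (i, j), (k, l)
    return None


def morph_stage_two(text_a, text_b, synonym_sets):
    """Substitute one end-compatible segment of Text A into Text B."""
    text_a, text_b = list(text_a), list(text_b)
    match = find_segment_pair(text_a, text_b, synonym_sets)
    if match is None:
        return text_b
    (i, j), (k, l) = match
    return text_b[:k] + text_a[i:j + 1] + text_b[l + 1:]


text_a = ["she", "began", "to paint", "the fence", "prior to", "lunch"]
text_b = ["he", "started", "to tune", "the piano", "before", "dinner"]
synonym_sets = [{"began", "started"}, {"prior to", "before"}]

spliced = morph_stage_two(text_a, text_b, synonym_sets)
```

In this toy case the middle of the spliced segment ("to paint", "the fence") is not a synonym of the middle it replaces ("to tune", "the piano"), so the meaning of Text B changes, not just its style.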
In some respects, stage-two morphing is analogous to genetic engineering that involves splicing and substituting gene sequences in which the end-portions of different gene segments are compatible for linking, but their middle portions are different. In genetic engineering, different gene segments with compatible end portions are spliced and substituted to create new organisms. A new organism created in this manner may, or may not, be functional and useful. In second-stage text morphing, different text segments with compatible end portions are spliced and substituted to create new concepts and expressions. A new concept or expression created in this manner may, or may not, be logical and useful. Although there is no guarantee that either genetic engineering or text morphing will yield useful results each time that it is used, both processes are powerful tools for creating new things when guided by human intuition and combined with human imagination.
In some respects, morphing text is also analogous to morphing images. The identification of text segments with synonymous starting and end points in second-stage morphing is analogous, in some respects, to the identification of common structural points when morphing two or more images together.
Since second-stage text morphing involves the substitution of phrases that are not synonyms, it is more transformational than the first stage. It morphs text meaning as well as writing style. On the downside, this can create narrative that is grammatically correct, but has sections that are absurd or illogical. If you morph a picture of a face with a picture of a car, then the resulting morphed picture may have elements that are absurd or illogical. Faces do not have windshields. Cars do not have eyes. However, on the upside, even absurd or illogical elements can inspire something useful or entertaining. Perhaps a morphed image of a face and a car might inspire you to design a car with lights that look like eyes? Perhaps you might be inspired to create a car with cameras that recognize approaching objects and warn the driver to avoid collisions? Perhaps you might be inspired to create an air-pressure-based “windshield” for the face that protects the wearer from exposure to germs without the need to cover the nose with a mask? Similar inspiration can come from morphing text context.
When morphing text, as when morphing images, morphing things that are similar is more likely to produce a logical and coherent synthesis than morphing things that are very different, but morphing dissimilar things is more likely to produce novel, thought-provoking, and/or entertaining results. The ability to controllably morph two bodies of text with multi-stage text morphing can create a novel combination, integration, or synthesis of Texts A and B that may prompt human imagination toward useful narrative, concepts, or products. The degree of transformation in second-stage text morphing depends on several factors including: the sizes of Text A and Text B; the degree of content and style similarity between Text A and Text B; and the size of any database of phrase synonyms used and its relevance to the contents of Text A and Text B.
For the sake of diagrammatic simplicity,
Two such triplets meeting these relational criteria are identified in
In the embodiment shown in
For example, for each triplet, phrase substitution may be done only once and that one time is within the triplet. For example, phrase 502 is only substituted once for phrase 513. In another example, for each triplet, phrase substitution may be done repeatedly and uniformly throughout all of Text B. Phrase 502 may be substituted for any occurrence of phrase 513, or a synonym of 513, throughout Text B. In another example, for each triplet, phrase substitution may be done repeatedly and selectively throughout Text B. Phrase 503 may be substituted for occurrences of phrase 513, or a synonym for phrase 513, that meet certain additional criteria.
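The three substitution scopes described above (once within the triplet, uniformly throughout Text B, or selectively) can be sketched as one function with a scope parameter. This is a hedged illustration only: how triplets are identified is not reproduced here, and the function name `substitute`, its parameters, the example phrases, and the selection criterion are all hypothetical.

```python
def substitute(text_b, new_phrase, old_phrases, scope, index=None, accept=None):
    """Substitute `new_phrase` for occurrences of `old_phrases` in Text B.

    old_phrases: the target phrase plus any of its synonyms.
    scope:
      "once"      - only at position `index` (the position within the triplet)
      "uniform"   - at every occurrence throughout Text B
      "selective" - only where accept(text_b, position) returns True
    """
    if scope not in ("once", "uniform", "selective"):
        raise ValueError(f"unknown scope: {scope}")
    targets = set(old_phrases)
    out = list(text_b)
    for pos, phrase in enumerate(text_b):
        if phrase not in targets:
            continue
        if scope == "once" and pos != index:
            continue
        if scope == "selective" and not accept(text_b, pos):
            continue
        out[pos] = new_phrase
    return out


text_b = ["the", "car", "passed", "a", "car", "and", "an", "automobile"]
targets = {"car", "automobile"}

once = substitute(text_b, "bicycle", targets, "once", index=1)
uniform = substitute(text_b, "bicycle", targets, "uniform")
# Hypothetical additional criterion: only substitute when preceded by "a".
selective = substitute(
    text_b, "bicycle", targets, "selective",
    accept=lambda text, pos: pos > 0 and text[pos - 1] == "a",
)
```

The `accept` callback is one possible way to express "certain additional criteria"; any predicate over the surrounding context could be used in its place.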
In considering these different examples of stage three of this multi-stage method, it should be kept in mind that substituting phrases into specified contexts (such as contexts in which the preceding and following phrases are more defined) will be less intrusive, but also less transformational, than substituting phrases into unspecified contexts (such as contexts in which the preceding and following phrases are less defined). It is a trade-off. The embodiment shown in
There is only one quadruplet identified in
In another example of this method, one could repeat one or more stages or stage sequences until selected process or outcome criteria are met. For example, one could cycle repeatedly through all four stages until a selected percentage of the words in Text B have been changed. Morphing toward a defined percentage such as this is analogous, in some respects, to selecting a percentage blend when morphing two images together. In another example of this method, it is possible to change the order of the morphing stages, but generally it makes the most sense to start with the least intrusive and transformational stage (stage one) and then progress along the continuum to the most intrusive and transformational stage (stage four).
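Cycling through stages until a target percentage of Text B has changed can be sketched as a simple control loop. This is a minimal sketch under assumptions: each stage is modeled as a hypothetical callable that takes the current Text B (a list of words or phrases) and returns a new one, the change measure uses Python's standard `difflib.SequenceMatcher`, and the names `fraction_changed` and `morph_until` and the `max_cycles` safety limit are illustrative additions.

```python
import difflib


def fraction_changed(original, current):
    """Fraction of the original items that no longer appear in aligned position."""
    if not original:
        return 0.0
    matcher = difflib.SequenceMatcher(a=original, b=current, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return (len(original) - unchanged) / len(original)


def morph_until(text_b, stages, target=0.3, max_cycles=10):
    """Apply stages in order, repeatedly, until `target` of Text B has changed.

    stages: callables ordered from least to most transformational.
    max_cycles: safety limit in case the target cannot be reached.
    """
    original = list(text_b)
    current = list(text_b)
    for _ in range(max_cycles):
        for stage in stages:
            current = stage(current)
            if fraction_changed(original, current) >= target:
                return current
    return current


def replace_first_x(text):
    """Toy stand-in for a morphing stage: change the first remaining 'x' to 'y'."""
    out = list(text)
    if "x" in out:
        out[out.index("x")] = "y"
    return out


result = morph_until(["x"] * 10, [replace_first_x], target=0.3)
```

Here the toy stage changes one item per pass, so the loop stops as soon as three of the ten items differ from the original. Real stages would close over Text A, the synonym database, and any reference text.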
This method for multi-stage text morphing can also be used to enhance text-based search engines. Traditionally, text-based search engines respond to a search query by separately evaluating, ranking, and displaying individual sources based on their individual relevance to the search query. For example, a search engine may separately rank a large number of individual sources for relevance to a search query and then display the top ten individual sources on the first page of results.
However, there may be two complementary sources (A and B) that are not ranked high enough to appear on the first page when each is evaluated individually, but which provide the best answer to the search query when their contents are combined. Sources A and B combined provide a more comprehensive answer to the search query than any combination of the sources that appear on the first page of the traditional search engine. The traditional search engine, only evaluating and ranking sources individually, is blind to this. However, an integrative search engine, one that evaluates and ranks combinations of sources, can recognize this and inform the user of which source combinations are the best. The missing piece for an integrative search engine is a method to combine text sources for combined analysis for relevance to a query. The method for multi-stage text morphing disclosed herein can be this missing piece.
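The difference between ranking sources individually and ranking combinations of sources can be shown with a toy sketch. Everything here is an assumption made for illustration: relevance is measured as simple query-term coverage, sources are combined by plain concatenation as a stand-in for text morphing, and the function names, query, and source texts are hypothetical.

```python
from itertools import combinations


def relevance(query_terms, text):
    """Toy relevance score: fraction of distinct query terms found in the text."""
    terms = set(query_terms)
    words = set(text.lower().split())
    return len(terms & words) / len(terms)


def rank_sources_and_pairs(query, sources, combine=lambda a, b: a + " " + b):
    """Score every single source and every pair of sources against the query.

    `combine` is a placeholder for a method that merges two texts; plain
    concatenation is used here only to keep the sketch self-contained.
    Returns a list of (score, source_names) sorted best first.
    """
    terms = query.lower().split()
    scored = [(relevance(terms, text), (name,)) for name, text in sources.items()]
    for (n1, t1), (n2, t2) in combinations(sources.items(), 2):
        scored.append((relevance(terms, combine(t1, t2)), (n1, n2)))
    # Higher score first; prefer fewer sources on ties, then sort by name.
    scored.sort(key=lambda item: (-item[0], len(item[1]), item[1]))
    return scored


sources = {
    "A": "solar battery recycling programs",
    "B": "storage cost and safety rules",
    "C": "solar battery storage cost overview",
}
query = "solar battery storage cost recycling safety"

ranked = rank_sources_and_pairs(query, sources)
best_single = max((entry for entry in ranked if len(entry[1]) == 1),
                  key=lambda entry: entry[0])
```

In this toy data, source C ranks highest when sources are scored individually, yet the pair of A and B is the only combination that covers every query term, which is the situation an individual-ranking search engine cannot detect.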
Finally,
This patent application claims the priority benefits of: U.S. Provisional Patent Application 61/336757 entitled “Morphing Text Style” filed on Jan. 25, 2010 by Robert A. Connor; U.S. Provisional Patent Application 61/336758 entitled “Morphing Text by Splicing End-Compatible Segments” filed on Jan. 25, 2010 by Robert A. Connor; and U.S. Provisional Patent Application 61/336759 entitled “Multi-Stage Text Morphing” filed on Jan. 25, 2010 by Robert A. Connor.
Number | Name | Date | Kind |
---|---|---|---|
4456973 | Carlgren et al. | Jun 1984 | A |
4641264 | Nitta et al. | Feb 1987 | A |
4773039 | Zamora | Sep 1988 | A |
5210473 | Backstrand | May 1993 | A |
5265065 | Turtle | Nov 1993 | A |
5584024 | Shwartz | Dec 1996 | A |
5708825 | Sotomayor | Jan 1998 | A |
5717913 | Driscoll | Feb 1998 | A |
5742834 | Kobayashi | Apr 1998 | A |
5953718 | Wical | Sep 1999 | A |
6269368 | Diamond | Jul 2001 | B1 |
6289337 | Davies et al. | Sep 2001 | B1 |
6389409 | Horovitz et al. | May 2002 | B1 |
6424358 | DiDomizio et al. | Jul 2002 | B1 |
6523028 | DiDomizio et al. | Feb 2003 | B1 |
6542889 | Aggarwal et al. | Apr 2003 | B1 |
6632251 | Rutten et al. | Oct 2003 | B1 |
6757692 | Davis et al. | Jun 2004 | B1 |
6847966 | Sommer et al. | Jan 2005 | B1 |
6865572 | Boguraev et al. | Mar 2005 | B2 |
6970859 | Brechner et al. | Nov 2005 | B1 |
7003516 | Dehlinger et al. | Feb 2006 | B2 |
7062487 | Nagaishi et al. | Jun 2006 | B1 |
7113943 | Bradford et al. | Sep 2006 | B2 |
7124362 | Tischer | Oct 2006 | B2 |
7167824 | Kallulli | Jan 2007 | B2 |
7231379 | Parikh et al. | Jun 2007 | B2 |
7231393 | Harik et al. | Jun 2007 | B1 |
7260567 | Parikh et al. | Aug 2007 | B2 |
7292972 | Lin et al. | Nov 2007 | B2 |
7296009 | Jiang et al. | Nov 2007 | B1 |
7366711 | McKeown et al. | Apr 2008 | B1 |
7370056 | Parikh et al. | May 2008 | B2 |
7401077 | Bobrow et al. | Jul 2008 | B2 |
7472343 | Vasey | Dec 2008 | B2 |
7480642 | Koono et al. | Jan 2009 | B2 |
7496621 | Pan et al. | Feb 2009 | B2 |
7499934 | Zhang et al. | Mar 2009 | B2 |
7548913 | Ekberg et al. | Jun 2009 | B2 |
7567976 | Betz et al. | Jul 2009 | B1 |
7580921 | Patterson | Aug 2009 | B2 |
7580929 | Patterson | Aug 2009 | B2 |
7584175 | Patterson | Sep 2009 | B2 |
7587309 | Rohrs et al. | Sep 2009 | B1 |
7587387 | Hogue | Sep 2009 | B2 |
7599899 | Rehberg et al. | Oct 2009 | B2 |
7627562 | Kacmarcik et al. | Dec 2009 | B2 |
7627809 | Balinsky | Dec 2009 | B2 |
7630980 | Parikh | Dec 2009 | B2 |
7636714 | Lamping et al. | Dec 2009 | B1 |
7640158 | Detlef et al. | Dec 2009 | B2 |
7689899 | Leymaster et al. | Mar 2010 | B2 |
7721201 | Grigoriadis et al. | May 2010 | B2 |
20060253431 | Bobick et al. | Nov 2006 | A1 |
20070100823 | Inmon | May 2007 | A1 |
20080319962 | Riezler et al. | Dec 2008 | A1 |
20090018990 | Moraleda | Jan 2009 | A1 |
20090024606 | Schilit et al. | Jan 2009 | A1 |
20090055394 | Schilit et al. | Feb 2009 | A1 |
20090083027 | Hollingsworth | Mar 2009 | A1 |
20090094137 | Toppenberg et al. | Apr 2009 | A1 |
20090193011 | Blair-Goldensohn et al. | Jul 2009 | A1 |
20090216738 | Dexter et al. | Aug 2009 | A1 |
20090216764 | Dexter | Aug 2009 | A1 |
20090217159 | Dexter et al. | Aug 2009 | A1 |
20090217168 | Dexter et al. | Aug 2009 | A1 |
20090292719 | Lachtarnik et al. | Nov 2009 | A1 |
20090313233 | Hanazawa | Dec 2009 | A1 |
20090313243 | Buitelaar et al. | Dec 2009 | A1 |
20100036838 | Ellis | Feb 2010 | A1 |
20100070448 | Omoigui | Mar 2010 | A1 |
Number | Date | Country | |
---|---|---|---|
20110184725 A1 | Jul 2011 | US |
Number | Date | Country | |
---|---|---|---|
61336757 | Jan 2010 | US | |
61336758 | Jan 2010 | US | |
61336759 | Jan 2010 | US |