Method and apparatus for generation and augmentation of search terms from external and internal sources

Description

BACKGROUND OF THE INVENTION

1. Technical Field

The invention relates to speech recognition and speech directed device control. More particularly, the invention relates to a method and apparatus for the generation and augmentation of search terms from external and internal sources, in connection with speech recognition and speech directed device control.

2. Description of the Prior Art

One area of technical innovation is that of navigation of content by spoken and textual command. Such systems typically perform speech recognition by use of a grammar-based ASR (automatic speech recognition) system, where the grammar defines those terms that can be recognized. In such systems, the navigated content comprises a catalog, content database, or other repository, for example: currently airing broadcast TV programs, the contents of a video-on-demand (VOD) system, a catalog of cell phone ring tones, a catalog of songs, or a catalog of games. Hereafter, all of the above sources of content are referred to as a repository.

Content sources are updated and/or expanded on occasion, possibly periodically, possibly as frequently as daily. In some applications such as those described above, content sources are assumed, by both system architects and system users, to reflect trends and interests in popular culture. It is therefore desirable to make content sources searchable by names of artists, popular topics, personalities, etc. However, known ASR systems recognize only those phrases that are listed in the grammar.

It would be desirable to identify names, personalities, titles, and topics that are present in a repository, and place them into a grammar. It would also be desirable to identify names, personalities, titles, and topics that are not present in the repository, and place them into a grammar; for in this way, such names, personalities, titles and topics may at least be recognized by the ASR system, which can then report that no suitable content is present in the repository.

SUMMARY OF THE INVENTION

The presently preferred embodiment of the invention provides a method and apparatus to identify names, personalities, titles, and topics that are present in a repository. A further embodiment of the invention provides a method and apparatus to identify names, personalities, titles, and topics that are not present in the repository. A key aspect of the invention uses information from external data sources, notably non-speech, text-based searches, to expand the search terms. The expansion takes place in two forms: (1) finding plausible linguistic variants of existing search terms that are already comprehended in the repository, but that are under slightly different names; and (2) expanding the existing search term list with items that should be there by virtue of their currency in popular culture, but which for whatever reason have not yet been reflected with content items in the repository.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block schematic diagram showing search term generation flow according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

The presently preferred embodiment of the invention provides a method and apparatus to identify names, personalities, titles, and topics that are present in a repository. A further embodiment of the invention provides a method and apparatus to identify names, personalities, titles, and topics that are not present in the repository. A key aspect of the invention uses information from external data sources, notably non-speech, text-based searches, to expand the search terms entered. The expansion takes place in two forms: (1) finding plausible linguistic variants of existing search terms that are already comprehended in the repository, but that are present under slightly different names; and (2) expanding the existing search term list with items that should be there by virtue of their currency in popular culture, but which for whatever reason have not yet been reflected with content items in the repository.

An exemplary embodiment of the invention operates as follows:

First, extract search term candidates, also referred to as candidate search terms, from external sources, for instance:

1. Published lists of frequent textual searches against popular search engines, e.g. Yahoo “top searches;”

2. Published lists of popular artists and songs, e.g. music.aol.com/songs/newsongs “Top 100 Songs;”

3. Published lists of popular tags, e.g. ETonline.com “top tags;”

4. Published lists of most-emailed stories, e.g. NYtimes.com most emailed stories, ETonline.com most emailed stories; and

5. Published news feeds, such as RSS feeds, e.g. NYtimes.com/rss.

Nominally, for the first three sources listed above, the candidate search terms are clearly identified as an explicitly marked title, author, artist name, etc. and, hence, processing is purely automatic. For the final two sources listed above, a combination of automatic means, such as named entity extraction (NEE) and/or topic detection and tracking (TDT) methods, and possibly direct human intervention, is applied to the running text or titles to generate candidate search terms. However, human intervention may be used with the first group as well.
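As a minimal sketch of the first processing step for a news feed, the following Python fragment pulls the item titles out of an RSS document, yielding the running text to which NEE or TDT methods would then be applied. The feed content, the constant SAMPLE_RSS, and the function name item_titles are illustrative assumptions and are not part of the described system; the story title is the one quoted in the examples of this description.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 fragment used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>David Brooks: The Outsourced Brain</title></item>
    <item><title>Another story</title></item>
  </channel>
</rss>"""


def item_titles(rss_text):
    """Return the title text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    titles = []
    for title in root.findall("./channel/item/title"):
        if title.text:  # skip empty <title/> elements
            titles.append(title.text.strip())
    return titles


if __name__ == "__main__":
    for t in item_titles(SAMPLE_RSS):
        print(t)
```

The titles returned here are not yet candidate search terms; an extraction step, automatic or human, would still be needed to reduce each title to a name or topic.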

Next, extract verified search terms from internal sources, for instance:

1. Explicitly marked titles, authors, artist names, etc. that are associated with the content elements in the repository; and/or

2. Sources derived by application of named entity extraction (NEE) and/or topic detection and tracking (TDT) methods to descriptive text associated with the content elements in the repository.

EXAMPLES

- Use of the topic “california fires”, appearing as the tenth-most-popular searched item, as listed in the “MOST POPULAR SEARCHED” section of the website nytimes.com of Oct. 27, 2007.
- Extraction of the proper name “David Brooks” from the frequently emailed article title “David Brooks: The Outsourced Brain,” appearing as the second-most-popular emailed article, as listed in the “MOST POPULAR EMAILED” section of the website nytimes.com of Oct. 27, 2007.

In the presently preferred embodiment of the invention, typical (although not exclusive) means of NEE and TDT analysis may be found in:

- Foundations of Statistical Natural Language Processing, by Chris Manning and Hinrich Schütze, MIT Press, Cambridge, Mass., May 1999.
- Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Feb. 8-11, 1998, Lansdowne Conference Resort, Lansdowne, Va., available at URL nist.gov/speech/publications/darpa98/

Next, match candidate search terms against verified search terms by well-known linguistic edit distance techniques, to obtain plausible linguistic variants of verified search terms. These variants are used to generate the augmented verified search terms.

- Example: “Mary J. Blige” is the initial verified search term, augmented with “Mary Blige” as a variant.
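The matching step can be sketched with a character-level Levenshtein distance, one common edit distance technique. The description does not specify which edit distance or which acceptance threshold is used, so the function names, the case folding, and the max_ratio parameter below are assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_variants(candidates, verified, max_ratio=0.3):
    """Map each verified term to candidate terms that are close to it.

    max_ratio is an assumed threshold: the edit distance divided by the
    length of the longer string must not exceed it.
    """
    variants = {}
    for v in verified:
        for c in candidates:
            if c.lower() == v.lower():
                continue  # identical terms are not variants
            d = edit_distance(c.lower(), v.lower())
            if d / max(len(c), len(v)) <= max_ratio:
                variants.setdefault(v, []).append(c)
    return variants


if __name__ == "__main__":
    print(find_variants(["Mary Blige", "california fires"], ["Mary J. Blige"]))
```

With these assumed settings, “Mary Blige” is three deletions away from “Mary J. Blige” and is accepted as a variant, while “california fires” is rejected. A production system might instead compare word or phoneme sequences.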

Finally, include those candidate search terms that do not point to actual content elements, but which the ASR system should nevertheless recognize by virtue of their high incidence count, repeated appearance in history as either a candidate or verified search term, or other criterion. We refer to such elements as “null search terms.”
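One possible selection rule for null search terms is sketched below, assuming that a term qualifies when it appears in at least min_count earlier term lists and does not match any verified search term. The description leaves the exact criterion open, so the function name, the history representation, and the threshold are hypothetical.

```python
from collections import Counter


def null_search_terms(candidates, verified, history, min_count=2):
    """Select candidates that recur in history but have no repository content.

    history is a list of earlier term lists (candidate or final search terms
    from previous update cycles). min_count is an assumed threshold.
    """
    counts = Counter()
    for past_terms in history:
        # Count each term at most once per earlier list, ignoring case.
        counts.update({t.lower() for t in past_terms})
    verified_lower = {v.lower() for v in verified}
    return [c for c in candidates
            if c.lower() not in verified_lower and counts[c.lower()] >= min_count]


if __name__ == "__main__":
    history = [["california fires"], ["California Fires", "one-off topic"]]
    print(null_search_terms(
        ["california fires", "Mary Blige", "one-off topic"],
        ["Mary Blige"],
        history))
```

Here “california fires” is retained because it recurs in the history and is not verified, whereas a term seen only once is dropped.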

FIG. 1 is a block schematic diagram showing search term generation flow according to the invention.

In FIG. 1, a grammar is augmented with regard to external sources 11 and internal sources, e.g. the repository, both as discussed above.

External sources comprise, for example, explicitly marked information 12 and running text 15. Explicitly marked text may be subject to an optional count filtering process 14, provided that incidence count information is available, whereby only those instances with a sufficiently high incidence count are retained. Running text is processed, as discussed above, with a module 17 that performs, for example, named entity extraction (NEE) or topic detection and tracking (TDT). The data from all external sources is combined by a module 18, and an output comprising candidate search terms (C[i]) 19 is generated. The combined output from external sources is further processed by a module 22 that performs such functions as incidence counting, low pass filtering, and other functions as desired, and is also passed to an approximate text matching module 33 (discussed below). The module 22 also receives historical information, such as a history of candidate search terms (C[i−1] . . . ) 20, a history of final search terms (S[i−1] . . . ) 21, and verified search terms (discussed in greater detail below). The output of the module 22 is provided to a further module 23, which identifies null search terms (N[i]), as discussed above.

Internal sources comprise, for example, explicitly marked information 27 and running text 28. Explicitly marked text may be subject to an optional count filtering process 29, whereby only those instances with a sufficiently high incidence count are retained. Running text is processed, as discussed above, with a module 30 that performs, for example, named entity extraction (NEE) or topic detection and tracking (TDT). The data from all internal sources is combined by a module 31, and an output comprising verified search terms (V[i]) 32 is generated. The verified search terms are used in connection with the module 22, as discussed above. The verified search terms are also provided to a module 33 for approximate text matching by linguistic edit distance techniques. The module 33 also receives the candidate search terms 19 as an input. The output of the module 33 is provided to a module 34 that generates augmented verified search terms (AV[i]).
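The count filtering and combining steps for external sources might look like the following sketch. The dictionary representation of incidence counts, the threshold value, and the case-insensitive de-duplication are assumptions chosen for illustration; the description does not prescribe them.

```python
def count_filter(marked_terms, min_count):
    """Keep explicitly marked terms whose incidence count is high enough.

    marked_terms maps each term to its incidence count (assumed format).
    """
    return [term for term, count in marked_terms.items() if count >= min_count]


def combine(*term_lists):
    """Merge term lists into one list of candidate search terms.

    Duplicates are dropped ignoring case; the first spelling seen is kept.
    """
    seen = set()
    combined = []
    for terms in term_lists:
        for term in terms:
            key = term.lower()
            if key not in seen:
                seen.add(key)
                combined.append(term)
    return combined


if __name__ == "__main__":
    marked = count_filter({"california fires": 120, "rare query": 2}, min_count=10)
    from_running_text = ["David Brooks", "California Fires"]
    print(combine(marked, from_running_text))
```

In this sketch the filtered, explicitly marked terms and the terms extracted from running text are merged into a single candidate list corresponding to C[i].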

The processed external sources information that is output by the module 23 and the processed internal sources information that is output by the module 34 are provided as inputs to a combining module 24 to produce final search terms (S[i]) 25, which are output.
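The final combination can be sketched as a lookup table from each recognizable phrase to the verified search term it refers to, with null search terms mapped to None so that the system can report that no suitable content is present. This data structure and the function name build_final_terms are assumptions for illustration, not the format used by the described apparatus.

```python
def build_final_terms(augmented_verified, null_terms):
    """Build a phrase -> verified-term mapping for the final search terms.

    augmented_verified maps each verified term to its list of variants.
    Null search terms map to None, meaning "recognized, but no content".
    """
    final = {}
    for verified_term, variants in augmented_verified.items():
        final[verified_term] = verified_term
        for variant in variants:
            # Do not overwrite an entry that is itself a verified term.
            final.setdefault(variant, verified_term)
    for term in null_terms:
        final.setdefault(term, None)
    return final


if __name__ == "__main__":
    terms = build_final_terms({"Mary J. Blige": ["Mary Blige"]}, ["california fires"])
    for phrase, target in terms.items():
        print(phrase, "->", target if target is not None else "no content in repository")
```

The keys of the resulting mapping are the phrases that would be added to the grammar, and the values tell the application whether, and to which content, a recognized phrase resolves.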

Although the invention is described herein with reference to the preferred embodiment, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the Claims included below.

Claims

1. An apparatus for identifying names, personalities, titles, and topics, whether or not said names, personalities, titles and topics are present in a given repository, comprising:
a plurality of external data sources, comprising non-speech, published lists of the text of frequent searches presented to popular text-based search engines, published lists of popular artists and song titles, published lists of most popular tags, published lists of most-emailed stories, and published news feeds;
a processor configured for extracting search term candidates from said external sources, the step of extracting further comprising:
extracting candidate search terms from at least one document from among a plurality of documents available from a plurality of sources of unstructured published content available over a computer network, wherein said sources of unstructured published content at least includes sources selected from among a group of sources consisting of published lists of most-emailed stories and published news feeds, and wherein extracting further comprises an automatic extraction means selected from among:
named entity extraction (NEE);
topic detection and tracking (TDT);
direct human intervention; and
a combination of NEE, TDT, and direct human intervention;
storing said candidate search terms in a historical database of candidate search terms;
said processor configured for extracting verified search terms from one or more internal sources;
said processor configured for expanding search terms entered using information from said external data sources, said means for expanding search terms comprising means for matching candidate search terms against verified search terms by applying linguistic edit distance techniques to obtain plausible linguistic variants of verified search terms and further comprising:
said processor configured for finding plausible linguistic variants of existing search terms that are already comprehended in the repository, but that are under slightly different names; and
said processor configured for using said external sources to identify items that should be in an existing search term list by virtue of their currency in popular culture, but which have not yet been included among content items in the repository;
said processor configured for expanding said existing search term list with said identified items;
said processor configured for using said linguistic variants to generate augmented verified search terms;
said processor configured for storing said augmented verified search terms in a historical database of verified search terms;
said processor configured for establishing a set of null search terms comprising candidate search terms having a high incidence count in said historical database of candidate search terms and in said historical database of verified search terms; and
said processor configured for adding said set of search terms comprising any of said augmented verified search terms and said null search terms to any of an automatic speech recognition or natural language processing system.
2. The apparatus of claim 1, said internal sources comprising any of: explicitly marked titles, authors, artist names, that are associated to content elements in said repository.
3. The apparatus of claim 1, said internal sources comprising: sources obtained by application of named entity extraction (NEE) and/or topic detection and tracking (TDT) methods to descriptive text associated to content elements in said repository.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 11/930,951, filed Oct. 31, 2007, which is a divisional application of U.S. patent application Ser. No. 10/699,543, filed Oct. 30, 2003, which claims priority to U.S. provisional patent application Ser. No. 60/422,561, filed Oct. 31, 2002, each of which is incorporated herein in its entirety by this reference thereto.

Related Publications (1)

	Number	Date	Country
	20130060789 A1	Mar 2013	US

Provisional Applications (1)

	Number	Date	Country
	60422561	Oct 2002	US

Divisions (1)

	Number	Date	Country
Parent	10699543	Oct 2003	US
Child	11930951		US

Continuations (1)

	Number	Date	Country
Parent	11930951	Oct 2007	US
Child	13667446		US
