Claims
- 1. A method for storing multimedia data items in a database, comprising:
receiving data items from a plurality of types of media sources;
identifying regions of the data items, the regions including document regions, section regions, and passage regions, each of the section regions corresponding to one of the document regions, each of the passage regions corresponding to one of the section regions and one of the document regions;
generating document keys for the document regions;
generating section keys for the section regions;
generating passage keys for the passage regions;
storing the document keys in a document table in the database;
storing the section keys and corresponding ones of the document keys in a section table in the database; and
storing the passage keys and corresponding ones of the document keys and the section keys in a passage table in the database.
- 2. The method of claim 1, wherein the media sources include audio sources, video sources, and text sources.
- 3. The method of claim 1, wherein at least one of the data items includes one of the document regions, one or more of the section regions, and one or more of the passage regions.
- 4. The method of claim 1, wherein each of the document regions includes a body of media that is contiguous in time.
- 5. The method of claim 1, wherein each of the section regions includes a contiguous portion that pertains to a theme or topic within one of the document regions.
- 6. The method of claim 1, wherein each of the passage regions includes a contiguous portion that has a linguistic or structural property within one of the section regions.
- 7. The method of claim 1, wherein the document regions, the section regions, and the passage regions form hierarchies.
- 8. The method of claim 1, wherein the document keys uniquely identify corresponding ones of the document regions.
- 9. The method of claim 1, wherein the section keys uniquely identify corresponding ones of the section regions of corresponding ones of the document regions.
- 10. The method of claim 1, wherein the passage keys uniquely identify corresponding ones of the passage regions of corresponding ones of the section regions and corresponding ones of the document regions.
- 11. The method of claim 1, wherein the storing the document keys includes:
creating a plurality of records in the document table, and storing the document keys in separate ones of the records.
- 12. The method of claim 11, wherein the storing the document keys further includes:
storing, for each of the document keys, in one of the records of the document table, at least one of a time the document region was created, a source of the document region, a title of the document region, a time the document region started, a country in which the document region originated, and a language in which the document region was created.
- 13. The method of claim 1, wherein the storing the section keys includes:
creating a plurality of records in the section table, and storing the section keys and corresponding ones of the document keys in separate ones of the records.
- 14. The method of claim 13, wherein the storing the section keys further includes:
storing, for each of the section keys, in one of the records of the section table, at least one of a start time of the section region, a duration of the section region, and a language in which the section region was created.
- 15. The method of claim 1, wherein the storing the passage keys includes:
creating a plurality of records in the passage table, and storing the passage keys and corresponding ones of the document keys and the section keys in separate ones of the records.
- 16. The method of claim 15, wherein the storing the passage keys further includes:
storing, for each of the passage keys, in one of the records of the passage table at least one of a start time of the passage region, a duration of the passage region, a name of a speaker in the passage region, a gender of a speaker in the passage region, and a language in which the passage region was created.
- 17. The method of claim 1, further comprising:
creating a full text table; and storing text relating to the data items in the full text table.
- 18. The method of claim 1, further comprising:
creating a topic labels table; and storing topics relating to the section regions in the topic labels table.
- 19. The method of claim 1, further comprising:
creating a named entity table; and storing, in the named entity table, names of people, places, and organizations identified within the passage regions, the section regions, or the document regions.
- 20. The method of claim 1, further comprising:
creating a facts table; and storing, in the facts table, factual information regarding people, places, and organizations identified within the passage regions, the section regions, or the document regions.
- 21. The method of claim 1, further comprising:
identifying words located in at least one of the passage regions, the section regions, and the document regions;
generating time/offset keys for each of the words, the time/offset keys identifying times at which corresponding ones of the words were spoken or character offsets of corresponding ones of the words; and
storing the time/offset keys in a time/offset table in the database.
- 22. The method of claim 21, wherein each of the time/offset keys corresponds to one of the passage regions, one of the section regions, and one of the document regions; and
wherein the storing the time/offset keys includes:
storing the time/offset keys and corresponding ones of the document keys, the section keys, and the passage keys in the time/offset table.
- 23. A system for facilitating searching and retrieval of multimedia data items, comprising:
means for receiving data items from a plurality of types of media sources;
means for identifying regions in the data items, the regions including document regions, section regions, and passage regions, each of the section regions corresponding to one of the document regions, each of the passage regions corresponding to one of the section regions and one of the document regions;
means for generating document keys that identify the document regions;
means for generating section keys that identify the section regions;
means for generating passage keys that identify the passage regions;
means for creating a document table, a section table, and a passage table;
means for storing the document keys in the document table;
means for storing the section keys and corresponding ones of the document keys in the section table; and
means for storing the passage keys and corresponding ones of the document keys and the section keys in the passage table.
- 24. A system for facilitating searching and retrieval of multimedia data items, comprising:
a database configured to store:
a document table that includes a plurality of document records,
a section table that includes a plurality of section records, and
a passage table that includes a plurality of passage records; and
a loader connected to the database and configured to:
receive data items from a plurality of types of media sources,
identify regions in the data items, the regions including document regions, section regions, and passage regions, each of the section regions corresponding to one of the document regions, each of the passage regions corresponding to one of the section regions and one of the document regions,
store document identifiers relating to the document regions in separate ones of the document records in the document table,
store section identifiers relating to the section regions in separate ones of the section records in the section table, and
store passage identifiers relating to the passage regions in separate ones of the passage records in the passage table.
- 25. The system of claim 24, wherein when storing the section identifiers, the loader is configured to:
store section identifiers and corresponding ones of the document identifiers in separate ones of the section records.
- 26. The system of claim 24, wherein when storing the passage identifiers, the loader is configured to:
store passage identifiers and corresponding ones of the document identifiers and the section identifiers in separate ones of the passage records.
- 27. The system of claim 24, wherein the media sources include audio sources, video sources, and text sources.
- 28. The system of claim 24, wherein at least one of the data items includes one of the document regions, one or more of the section regions, and one or more of the passage regions.
- 29. The system of claim 24, wherein each of the document regions includes a body of media that is contiguous in time.
- 30. The system of claim 24, wherein each of the section regions includes a contiguous portion that pertains to a theme or topic within one of the document regions.
- 31. The system of claim 24, wherein each of the passage regions includes a contiguous portion that has a linguistic or structural property within one of the section regions.
- 32. The system of claim 24, wherein the document regions, the section regions, and the passage regions form hierarchies.
- 33. The system of claim 24, wherein the document identifiers uniquely identify corresponding ones of the document regions, the section identifiers uniquely identify corresponding ones of the section regions of corresponding ones of the document regions, and the passage identifiers uniquely identify corresponding ones of the passage regions of corresponding ones of the section regions and corresponding ones of the document regions.
- 34. The system of claim 24, wherein when storing the document identifiers, the loader is configured to:
store, for each of the document identifiers, in one of the document records, at least one of a time the document region was created, a source of the document region, a title of the document region, a time the document region started, a country in which the document region originated, and a language in which the document region was created.
- 35. The system of claim 24, wherein when storing the section identifiers, the loader is configured to:
store, for each of the section identifiers, in one of the section records, at least one of a start time of the section region, a duration of the section region, and a language in which the section region was created.
- 36. The system of claim 24, wherein when storing the passage identifiers, the loader is configured to:
store, for each of the passage identifiers, in one of the passage records, at least one of a start time of the passage region, a duration of the passage region, a name of a speaker in the passage region, a gender of a speaker in the passage region, and a language in which the passage region was created.
- 37. The system of claim 24, wherein the database is further configured to store:
a full text table that stores text relating to the data items.
- 38. The system of claim 24, wherein the database is further configured to store:
a topic labels table that stores topics relating to the section regions.
- 39. The system of claim 24, wherein the database is further configured to store:
a named entity table that stores names of people, places, and organizations identified within the passage regions, the section regions, or the document regions.
- 40. The system of claim 24, wherein the database is further configured to store:
a facts table that stores factual information regarding people, places, and organizations identified within the passage regions, the section regions, or the document regions.
- 41. The system of claim 24, wherein the database is further configured to store:
a time/offset table that includes a plurality of time/offset records.
- 42. The system of claim 41, wherein the loader is further configured to:
identify words located in at least one of the passage regions, the section regions, and the document regions, and
store time/offset identifiers relating to the words in separate ones of the time/offset records, the time/offset identifiers identifying times at which corresponding ones of the words were spoken or character offsets of corresponding ones of the words.
- 43. The system of claim 42, wherein at least one of the time/offset identifiers corresponds to one of the passage regions, one of the section regions, and one of the document regions.
- 44. A database that stores data items relating to a plurality of media types, the data items including a plurality of regions, the regions including document regions, section regions, and passage regions, each of the section regions corresponding to one of the document regions, each of the passage regions corresponding to one of the section regions and one of the document regions, the database comprising:
a document table configured to store a plurality of document keys that identify the document regions as a plurality of document records;
a section table configured to store a plurality of section keys that identify the section regions as a plurality of section records, the section records also storing corresponding ones of the document keys; and
a passage table configured to store a plurality of passage keys that identify the passage regions as a plurality of passage records, the passage records also storing corresponding ones of the section keys and the document keys.
- 45. The database of claim 44, wherein the document keys are primary keys in the document table.
- 46. The database of claim 44, wherein the section keys and corresponding ones of the document keys are primary keys in the section table.
- 47. The database of claim 44, wherein the passage keys and corresponding ones of the section keys and the document keys are primary keys in the passage table.
- 48. The database of claim 44, wherein the data items further include a plurality of words, the database further comprising:
a time/offset table configured to store a plurality of time/offset keys that relate to the words as a plurality of time/offset records, the time/offset records also storing one or more of the document keys, the section keys, and the passage keys.
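
By way of illustration only, the table arrangement recited in claims 44-48 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation described by the application: the choice of SQLite, the table and column names, and the helper functions `create_database` and `load_item` are all hypothetical. It shows document keys as the primary key of the document table, composite primary keys in the section and passage tables, and a time/offset table whose records carry the document, section, and passage keys.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE document (
    document_key INTEGER PRIMARY KEY,          -- cf. claim 45
    title        TEXT
);
CREATE TABLE section (
    document_key INTEGER NOT NULL REFERENCES document(document_key),
    section_key  INTEGER NOT NULL,
    PRIMARY KEY (document_key, section_key)    -- cf. claim 46
);
CREATE TABLE passage (
    document_key INTEGER NOT NULL,
    section_key  INTEGER NOT NULL,
    passage_key  INTEGER NOT NULL,
    PRIMARY KEY (document_key, section_key, passage_key),  -- cf. claim 47
    FOREIGN KEY (document_key, section_key)
        REFERENCES section(document_key, section_key)
);
CREATE TABLE time_offset (
    document_key    INTEGER NOT NULL,
    section_key     INTEGER NOT NULL,
    passage_key     INTEGER NOT NULL,
    time_offset_key REAL NOT NULL,  -- time spoken or character offset
    word            TEXT NOT NULL,
    PRIMARY KEY (document_key, section_key, passage_key, time_offset_key),
    FOREIGN KEY (document_key, section_key, passage_key)
        REFERENCES passage(document_key, section_key, passage_key)
);
"""


def create_database(path=":memory:"):
    """Open a SQLite database and create the four illustrative tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the key hierarchy
    conn.executescript(SCHEMA)
    return conn


def load_item(conn, document_key, title, sections):
    """Store one data item whose regions have already been identified.

    `sections` is a list of sections; each section is a list of passages;
    each passage is a list of (time_or_offset, word) pairs. Section and
    passage keys are numbered from 1 within their parent region.
    """
    with conn:  # one transaction per data item
        conn.execute(
            "INSERT INTO document (document_key, title) VALUES (?, ?)",
            (document_key, title),
        )
        for s_key, passages in enumerate(sections, start=1):
            conn.execute(
                "INSERT INTO section (document_key, section_key) VALUES (?, ?)",
                (document_key, s_key),
            )
            for p_key, words in enumerate(passages, start=1):
                conn.execute(
                    "INSERT INTO passage (document_key, section_key, passage_key)"
                    " VALUES (?, ?, ?)",
                    (document_key, s_key, p_key),
                )
                conn.executemany(
                    "INSERT INTO time_offset VALUES (?, ?, ?, ?, ?)",
                    [(document_key, s_key, p_key, t, w) for t, w in words],
                )
```

In this sketch a section key is unique only within its document, and a passage key only within its section, which is why the lower-level tables use composite keys. Other arrangements, such as globally unique keys at every level, would also be consistent with the claim language.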
RELATED APPLICATIONS
[0001] This application claims priority under 35 U.S.C. §119 based on U.S. Provisional Application Nos. 60/394,064 and 60/394,082, filed Jul. 3, 2002, and Provisional Application No. 60/419,214, filed Oct. 17, 2002, the disclosures of which are incorporated herein by reference.
GOVERNMENT CONTRACT
[0002] The U.S. Government may have a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. N66001-00-C-8008 awarded by the Defense Advanced Research Projects Agency.
Provisional Applications (3)
| Number | Date | Country |
| --- | --- | --- |
| 60394064 | Jul 2002 | US |
| 60394082 | Jul 2002 | US |
| 60419214 | Oct 2002 | US |