Claims
- 1. A multimedia recognition system, comprising:
a plurality of indexers configured to:
receive multimedia data, and
analyze the multimedia data based on training data to generate a plurality of documents; and
a memory system configured to:
store the documents from the indexers,
receive user augmentation relating to one of the documents, and
provide the user augmentation to one or more of the indexers for retraining based on the user augmentation.
- 2. The system of claim 1, wherein the multimedia data includes at least two of audio data, video data, and text data.
- 3. The system of claim 2, wherein the indexers include at least two of:
an audio indexer configured to perform speech recognition on the audio data based on the training data, a video indexer configured to perform at least one of video recognition and speech recognition on the video data based on the training data, and a text indexer configured to perform text recognition on the text data based on the training data.
- 4. The system of claim 1, wherein when receiving user augmentation relating to one of the documents, the memory system is configured to:
receive correction of the one of the documents, as a corrected document, from a user, and store the corrected document.
- 5. The system of claim 4, wherein when providing the user augmentation to one or more of the indexers, the memory system is configured to send the corrected document to the one or more of the indexers for retraining based on the corrected document.
- 6. The system of claim 5, wherein the one or more of the indexers are configured to:
add the corrected document to the training data, and retrain based on the training data.
- 7. The system of claim 1, wherein when receiving user augmentation relating to one of the documents, the memory system is configured to:
receive enhancement of the one of the documents, as an enhanced document, from a user, and store the enhanced document.
- 8. The system of claim 7, wherein when providing the user augmentation to one or more of the indexers, the memory system is configured to send the enhanced document to the one or more of the indexers for retraining based on the enhanced document.
- 9. The system of claim 8, wherein the one or more of the indexers are configured to:
add the enhanced document to the training data, and retrain based on the training data.
- 10. The system of claim 1, wherein when receiving user augmentation relating to one of the documents, the memory system is configured to:
receive annotation of the one of the documents, as an annotated document, from a user, and store the annotated document.
- 11. The system of claim 10, wherein when providing the user augmentation to one or more of the indexers, the memory system is configured to send the annotated document to the one or more of the indexers for retraining based on the annotated document.
- 12. The system of claim 11, wherein the one or more of the indexers are configured to:
add the annotated document to the training data, and retrain based on the training data.
- 13. The system of claim 10, wherein when receiving annotation of the one of the documents, the memory system is configured to at least one of:
receive a bookmark relating to the one of the documents, receive highlighting regarding one or more portions of the one of the documents, and receive a note relating to at least a portion of the one of the documents.
- 14. The system of claim 13, wherein the note includes one of comments from the user, one of an audio, video, and text file, and a reference to another one of the documents.
- 15. The system of claim 1, wherein when receiving user augmentation relating to one of the documents, the memory system is configured to:
receive an attachment for the one of the documents from a user, and store the attachment.
- 16. The system of claim 15, wherein when providing the user augmentation to one or more of the indexers, the memory system is configured to send the attachment to the one or more of the indexers for retraining based on the attachment.
- 17. The system of claim 16, wherein the one or more of the indexers are configured to:
add the attachment to the training data, and retrain based on the training data.
- 18. The system of claim 15, wherein the attachment includes one of an audio document, a video document, and a text document, or a reference to the audio document, the video document, or the text document.
- 19. The system of claim 15, wherein the memory system is further configured to:
send the attachment for analysis by one or more of the indexers.
- 20. A multimedia recognition system, comprising:
means for receiving a plurality of types of multimedia data; means for recognizing the multimedia data based on training data to generate recognition results; means for storing the recognition results; means for receiving user augmentation relating to some of the recognition results; means for adding the user augmentation to the training data to obtain new training data; and means for retraining based on the new training data.
- 21. A method for improving recognition results, comprising:
receiving multimedia data; recognizing the multimedia data based on training data to generate a plurality of documents; receiving user augmentation relating to one of the documents; supplementing the training data with the user augmentation to obtain supplemented training data; and retraining based on the supplemented training data.
- 22. The method of claim 21, wherein the multimedia data includes at least two of audio data, video data, and text data.
- 23. The method of claim 22, wherein the recognizing the multimedia data includes at least two of:
performing speech recognition on the audio data based on the training data, performing at least one of video recognition and speech recognition on the video data based on the training data, and performing text recognition on the text data based on the training data.
- 24. The method of claim 21, wherein the receiving user augmentation relating to one of the documents includes:
receiving correction of the one of the documents, as a corrected document, from a user, and storing the corrected document.
- 25. The method of claim 24, wherein the supplementing the training data includes:
adding the corrected document to the training data.
- 26. The method of claim 21, wherein the receiving user augmentation relating to one of the documents includes:
receiving enhancement of the one of the documents, as an enhanced document, from a user, and storing the enhanced document.
- 27. The method of claim 26, wherein the supplementing the training data includes:
adding the enhanced document to the training data.
- 28. The method of claim 21, wherein the receiving user augmentation relating to one of the documents includes:
receiving annotation of the one of the documents, as an annotated document, from a user, and storing the annotated document.
- 29. The method of claim 28, wherein the supplementing the training data includes:
adding the annotated document to the training data.
- 30. The method of claim 28, wherein the receiving annotation of the one of the documents includes at least one of:
receiving a bookmark relating to the one of the documents, receiving highlighting regarding one or more portions of the one of the documents, and receiving a note relating to at least a portion of the one of the documents.
- 31. The method of claim 30, wherein the note includes one of comments from the user, one of an audio, video, and text file, and a reference to another one of the documents.
- 32. The method of claim 21, wherein the receiving user augmentation relating to one of the documents includes:
receiving an attachment for the one of the documents from a user, and storing the attachment.
- 33. The method of claim 32, wherein the supplementing the training data includes:
adding the attachment to the training data.
- 34. The method of claim 32, wherein the attachment includes one of an audio document, a video document, and a text document, or a reference to the audio document, the video document, or the text document.
- 35. The method of claim 32, further comprising:
performing at least one of speech recognition, video recognition, and text recognition on the attachment.
- 36. A computer-readable medium that stores instructions executable by one or more processors for improving recognition of multimedia data, comprising:
instructions for acquiring multimedia data; instructions for recognizing the multimedia data based on training data to generate a plurality of documents; instructions for obtaining user augmentation relating to one of the documents; instructions for adding the user augmentation to the training data to obtain new training data; and instructions for retraining based on the new training data.
- 37. A multimedia recognition system, comprising:
a plurality of indexers configured to:
receive multimedia data, and
analyze the multimedia data based on training data to generate a plurality of documents; and
a memory system configured to:
store the documents from the indexers,
obtain new documents,
store the new documents, and
provide the new documents to one or more of the indexers for retraining based on the new documents.
- 38. The system of claim 37, wherein the multimedia data includes at least two of audio data, video data, and text data.
- 39. The system of claim 38, wherein the indexers include at least two of:
an audio indexer configured to perform speech recognition on the audio data based on the training data, a video indexer configured to perform at least one of video recognition and speech recognition on the video data based on the training data, and a text indexer configured to perform text recognition on the text data based on the training data.
- 40. The system of claim 37, wherein when obtaining one of the new documents, the memory system is configured to at least one of:
receive text that has been cut-and-pasted, receive a file containing the one of the new documents, and receive a link to the one of the new documents.
- 41. The system of claim 37, wherein when obtaining the new documents, the memory system is configured to:
employ an agent to actively seek out and retrieve new documents.
- 42. The system of claim 37, wherein the one or more of the indexers are configured to:
add the new documents to the training data, and retrain based on the training data.
- 43. A multimedia recognition system, comprising:
means for receiving a plurality of types of multimedia data; means for recognizing the multimedia data based on training data to generate recognition results; means for obtaining new documents from one or more users; means for adding the new documents to the training data to obtain new training data; and means for retraining based on the new training data.
- 44. A method for improving recognition results, comprising:
receiving multimedia data; recognizing the multimedia data based on training data to generate a plurality of documents; obtaining new documents; supplementing the training data with the new documents to obtain supplemented training data; and retraining based on the supplemented training data.
- 45. The method of claim 44, wherein the multimedia data includes at least two of audio data, video data, and text data.
- 46. The method of claim 45, wherein the recognizing the multimedia data includes at least two of:
performing speech recognition on the audio data based on the training data, performing at least one of video recognition and speech recognition on the video data based on the training data, and performing text recognition on the text data based on the training data.
- 47. The method of claim 44, wherein the obtaining new documents includes at least one of:
receiving text that has been cut-and-pasted, receiving one or more files containing the new documents, and receiving one or more links to the new documents.
- 48. The method of claim 44, wherein the obtaining the new documents includes:
actively seeking out and retrieving the new documents.
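For illustration only, the data flow recited in the claims above (indexers that generate documents from training data, a memory system that stores them, and user augmentation or new documents fed back for retraining) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Document`, `Indexer`, `MemorySystem`, `receive_correction`, and `add_new_document` are hypothetical, and the "recognition" step is a toy vocabulary lookup standing in for real speech, video, or text recognition.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Document:
    """A recognition result produced by an indexer (hypothetical structure)."""
    doc_id: int
    media_type: str  # "audio", "video", or "text"
    text: str


class Indexer:
    """Toy indexer: its 'model' is only the set of words seen in training data."""

    def __init__(self, media_type: str, training_data: List[str]):
        self.media_type = media_type
        self.training_data = list(training_data)
        self.vocabulary = set()
        self.retrain()

    def retrain(self) -> None:
        # Rebuild the model (here, a word set) from all current training data.
        self.vocabulary = {
            word.lower() for line in self.training_data for word in line.split()
        }

    def analyze(self, doc_id: int, raw: str) -> Document:
        # Stand-in for recognition: words outside the vocabulary become "<unk>".
        words = [w if w.lower() in self.vocabulary else "<unk>" for w in raw.split()]
        return Document(doc_id, self.media_type, " ".join(words))


class MemorySystem:
    """Stores documents and routes augmentation back to matching indexers."""

    def __init__(self, indexers: List[Indexer]):
        self.indexers = indexers
        self.documents: Dict[int, Document] = {}

    def store(self, doc: Document) -> None:
        self.documents[doc.doc_id] = doc

    def receive_correction(self, doc_id: int, corrected_text: str) -> None:
        # Store the corrected document, then provide it for retraining.
        original = self.documents[doc_id]
        corrected = Document(doc_id, original.media_type, corrected_text)
        self.store(corrected)
        self._provide_for_retraining(corrected)

    def add_new_document(self, doc: Document) -> None:
        # Store a newly obtained document, then provide it for retraining.
        self.store(doc)
        self._provide_for_retraining(doc)

    def _provide_for_retraining(self, doc: Document) -> None:
        # Each matching indexer adds the document to its training data and retrains.
        for indexer in self.indexers:
            if indexer.media_type == doc.media_type:
                indexer.training_data.append(doc.text)
                indexer.retrain()


audio = Indexer("audio", ["the president spoke today"])
memory = MemorySystem([audio])

first = audio.analyze(1, "the president spoke in Baghdad")
memory.store(first)
print(first.text)   # the president spoke <unk> <unk>

memory.receive_correction(1, "the president spoke in Baghdad")
second = audio.analyze(2, "reporters in Baghdad")
print(second.text)  # <unk> in Baghdad
```

In this sketch a user correction of document 1 expands the indexer's training data, so words that were unrecognized before the correction are recognized in later input. Routing by `media_type` is a design assumption of the sketch; the claims only require that the augmentation reach "one or more of the indexers."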
RELATED APPLICATION
[0001] This application claims priority under 35 U.S.C. § 119 based on U.S. Provisional Application Nos. 60/394,064 and 60/394,082, filed Jul. 3, 2002, and Provisional Application No. 60/419,214, filed Oct. 17, 2002, the disclosures of which are incorporated herein by reference.
[0002] This application is related to U.S. patent application Ser. No. 10/______ (Docket No. 02-4042), entitled, “Continuous Learning for Speech Recognition Systems,” filed concurrently herewith, the disclosure of which is incorporated herein by reference.
Provisional Applications (3)
| Number | Date | Country |
| --- | --- | --- |
| 60394064 | Jul 2002 | US |
| 60394082 | Jul 2002 | US |
| 60419214 | Oct 2002 | US |