Claims
- 1. A speech transcription tool comprising:
control logic configured to play back portions of an audio stream; an input device configured to receive text from a user defining a transcription of the portions of the audio stream and receive annotation information from the user further defining the text; and a graphical user interface including
a first section configured to display a graphical representation of a waveform corresponding to the audio stream, and a second section configured to display the text and representations of the annotation information for the text.
- 2. The speech transcription tool of claim 1, wherein the graphical user interface further includes:
a third section configured to display a hierarchically structured representation of the text.
- 3. The speech transcription tool of claim 1, wherein the first section of the graphical user interface further includes graphical markers that define the portions of the audio stream.
- 4. The speech transcription tool of claim 1, wherein the representations of the annotation information include graphical icons.
- 5. The speech transcription tool of claim 1, wherein the input device further receives information from the user classifying the portions of the audio stream into a plurality of hierarchical segments.
- 6. The speech transcription tool of claim 5, wherein the segments include speaker turns, sections, and episodes.
- 7. The speech transcription tool of claim 1, wherein the control logic writes the transcription of the portions of the audio stream and the annotation information as a Unicode output file.
- 8. The speech transcription tool of claim 1, wherein the annotation information is selected from a possible set of annotation information defined by a configuration file.
- 9. The speech transcription tool of claim 1, wherein the annotation information is entered by the user through predefined keyboard shortcuts.
- 10. A method comprising:
receiving an audio stream containing speech data; receiving text from a user defining a transcription of the speech data; receiving annotation information from the user further defining the text; displaying the text; and displaying symbolic representations of the annotation information with the text.
- 11. The method of claim 10, further comprising:
displaying a graphical representation of a waveform corresponding to the audio stream.
- 12. The method of claim 11, wherein the graphical representation of the waveform includes graphical markers that define segments within the audio stream.
- 13. The method of claim 12, wherein the graphical markers are adjustable by the user, and wherein adjusting the markers adjusts a corresponding definition of a segment.
- 14. The method of claim 12, wherein the segments include speaker turns, sections, and episodes.
- 15. The method of claim 10, further comprising:
categorizing the audio stream into a plurality of hierarchically arranged segments, and displaying the hierarchically arranged segments.
- 16. The method of claim 10, wherein the symbolic representations of the annotation information include graphical icons.
- 17. The method of claim 10, wherein the annotation information is selected from a possible set of annotation information defined by a configuration file.
- 18. The method of claim 10, wherein the user enters the annotation information using predefined keyboard shortcuts.
- 19. A computing device for transcribing an audio file that includes speech, the computing device comprising:
an audio output device; a processor; and a computer memory coupled to the processor and containing programming instructions that when executed by the processor cause the processor to:
play a current one of a plurality of segments of the audio file through the audio output device, receive transcription information for speech segments of the segments of the audio file played through the audio output device, receive annotation information relating to the transcription information, and display the transcription information in an output section of a graphical user interface, and display the annotation information as graphical icons in the output section of the graphical user interface.
- 20. The computing device of claim 19, wherein the programming instructions additionally cause the processor to:
display a graphical representation of a waveform corresponding to the audio file.
- 21. The computing device of claim 20, wherein the graphical representation of the waveform includes graphical markers that represent the segments of the audio file.
- 22. The computing device of claim 19, wherein the graphical icons are displayed overlaid with the transcription information.
- 23. The computing device of claim 19, further comprising:
an input device configured to receive information from the user classifying the segments of the audio file.
- 24. The computing device of claim 23, wherein the segments include speaker turns, sections, and episodes.
- 25. The speech transcription tool of claim 19, wherein the processor writes the transcription information and the annotation information to a Unicode output file.
- 26. The speech transcription tool of claim 19, wherein the annotation information is selected from a possible set of annotation information defined by a configuration file.
- 27. A computer-readable medium containing program instructions for execution by a processor, the program instructions comprising:
instructions for obtaining an audio stream containing speech data; instructions for receiving text from a user that defines a transcription of the speech data; instructions for receiving annotation information from the user further defining the text; instructions for presenting the text; and instructions for providing symbolic representations of the annotation information with the text.
- 28. A device comprising:
means for receiving an audio stream containing speech data; means for receiving text from a user defining a transcription of the speech data; means for receiving annotation information from the user further defining the text; means for displaying the text; and means for displaying symbolic representations of the annotation information as graphical icons associated with the text.
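By way of illustration only, and not as a limitation of the claims, the hierarchical segments of claims 5, 6, and 13, the configuration-defined annotation set of claims 8 and 17, and the Unicode output file of claim 7 might be sketched as follows. Every name here (`ALLOWED_ANNOTATIONS`, `SpeakerTurn`, `Section`, `Episode`, `move_boundary`, `write_transcript`) is hypothetical, as are the JSON output format and the nesting order of episodes, sections, and speaker turns; the claims prescribe none of these.

```python
import json
from dataclasses import asdict, dataclass, field

# Assumption: a hard-coded stand-in for the annotation set that the claims
# say is defined by a configuration file.
ALLOWED_ANNOTATIONS = {"noise", "laugh", "foreign"}


@dataclass
class SpeakerTurn:
    speaker: str
    start: float  # seconds into the audio stream
    end: float
    text: str = ""
    annotations: list = field(default_factory=list)

    def annotate(self, kind: str, offset: int) -> None:
        """Attach an annotation at a character offset within the turn text."""
        if kind not in ALLOWED_ANNOTATIONS:
            raise ValueError(f"annotation {kind!r} is not in the configured set")
        if not 0 <= offset <= len(self.text):
            raise ValueError("offset falls outside the turn text")
        self.annotations.append({"kind": kind, "offset": offset})


@dataclass
class Section:
    topic: str
    turns: list = field(default_factory=list)


@dataclass
class Episode:
    title: str
    sections: list = field(default_factory=list)


def move_boundary(left: SpeakerTurn, right: SpeakerTurn, new_time: float) -> None:
    """Move the marker shared by two adjacent turns, updating both definitions."""
    if not left.start < new_time < right.end:
        raise ValueError("marker must stay between the outer edges of both turns")
    left.end = new_time
    right.start = new_time


def write_transcript(episode: Episode, path: str) -> None:
    """Write the transcription and annotations as a UTF-8 encoded file."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII transcript text as real characters.
        json.dump(asdict(episode), f, ensure_ascii=False, indent=2)
```

In this sketch, dragging a graphical marker would correspond to a call to `move_boundary`, which changes the end time of one segment and the start time of the next in a single step.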
RELATED APPLICATION
[0001] This application is related to the concurrently-filed U.S. application (Docket No. 02-4040), Ser. No. ______, titled “Fast Transcription of Speech,” which is incorporated herein by reference.
[0002] This application claims priority under 35 U.S.C. § 119 based on U.S. Provisional Application No. 60/419,214 filed Oct. 17, 2002, the disclosure of which is incorporated herein by reference.
GOVERNMENT CONTRACT
[0003] The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of (contract No. 1999*S018900*000) awarded by Federal Broadcast Information Service (FBIS).
Provisional Applications (1)
| Number | Date | Country |
|---|---|---|
| 60419214 | Oct 2002 | US |