1. Technical Field
The present disclosure relates to speech to text converting devices, and particularly to a speech to text converting device and a speech to text converting method.
2. Description of Related Art
Speech, or the spoken word, needs to be recorded in many fields. However, with conventional conversion, a reader cannot know the identity of a speaker once the speaker's speech has been converted to text.
Therefore, there is room for improvement within the art.
Many aspects of the embodiments can be better understood with reference to the following drawings. The components in the drawings are not necessarily drawn to scale, the emphasis instead being placed upon clearly illustrating the principles of the embodiments. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
The disclosure is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean at least one.
In general, the word “module”, as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions written in a programming language such as Java, C, or assembly. One or more software instructions in the modules may be embedded in firmware, such as EPROM. The modules described herein may be implemented as software modules, hardware modules, or both, and may be stored in any type of non-transitory computer-readable medium or other storage device. Some non-limiting examples of non-transitory computer-readable media include CDs, DVDs, BLU-RAY, flash memory, and hard disk drives.
Referring to
The storing module 10 stores different text data and data as to different human identities, each identity corresponding to passages of speech recorded from that person. The voice receiving module 40 receives audible speech from an external source. Within a period of time, the voice recognition module 20 converts the audible speech to voice data and sends text data corresponding to the spoken word to the control module 30. The identity recognition module 50 determines the identity of the speaker who is associated with that particular voice data and sends the corresponding identity data from the storing module 10 to the control module 30. The control module 30 displays the text data and the identity data.
Referring to
In step S201, the voice receiving module 40 receives audible speech in successive periods of time and sends the speech to the voice recognition module 20 and the identity recognition module 50.
In step S202, the voice recognition module 20 converts the speech to voice data and sends text data associated with the voice data from the storing module 10 to the control module 30. The identity recognition module 50 determines the identity of the speaker associated with the speech and sends the corresponding identity data to the control module 30.
In step S203, the control module 30 displays the text data and the identity data on the display 60.
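As a non-limiting illustration, the flow of steps S201 to S203 can be sketched in Python. This is only a sketch under stated assumptions: the function name convert_speech, the representation of each period of speech as a sequence of samples, and the two callables to_text and to_identity (standing in for the voice recognition module 20 and the identity recognition module 50) are hypothetical and are not part of the disclosure.

```python
from typing import Callable, Iterable, List, Sequence

# Assumed representation: one period of received speech is a sequence of samples.
Segment = Sequence[float]


def convert_speech(
    segments: Iterable[Segment],
    to_text: Callable[[Segment], str],
    to_identity: Callable[[Segment], str],
) -> List[str]:
    """Return one display line per period of received speech."""
    lines = []
    for segment in segments:              # S201: speech arrives in successive periods
        text = to_text(segment)           # S202: stand-in for voice recognition module 20
        identity = to_identity(segment)   # S202: stand-in for identity recognition module 50
        lines.append(f"{identity}: {text}")  # S203: line handed to the display
    return lines
```

Passing the two recognizers in as callables keeps the sketch independent of any particular recognition technique, which the description does not prescribe.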
Referring to
In step S2021, the identity recognition module 50 samples the speech.
In step S2022, the identity recognition module 50 compares the speech received against different reference speeches from the storing module 10, each reference speech corresponding to an identity.
In step S2023, the identity recognition module 50 looks up the identity data associated with the speech.
In step S2024, the identity recognition module 50 determines the duration of the complete speech and sends the identity data and data as to the duration to the control module 30.
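As a non-limiting illustration, steps S2022 to S2024 can be sketched in Python. The description does not specify how the received speech is compared against the reference speeches, so this sketch assumes that each speech has already been sampled (step S2021) into a numeric feature vector and that cosine similarity is used for the comparison. The function names, the feature vectors, the sample rate parameter, and the second speaker "Ms. Brown" are all hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize_identity(
    features: Sequence[float],
    references: Dict[str, Sequence[float]],
    n_samples: int,
    sample_rate_hz: int,
) -> Tuple[str, float]:
    """Return (identity, duration in seconds) for one complete speech."""
    # S2022: compare the received speech against every stored reference speech.
    # S2023: the identity is the key of the best-matching reference.
    identity = max(
        references,
        key=lambda name: cosine_similarity(features, references[name]),
    )
    # S2024: duration of the complete speech, derived from the sample count.
    duration_s = n_samples / sample_rate_hz
    return identity, duration_s


# Hypothetical reference data as it might be held by the storing module 10.
references = {"Mr. Green": [1.0, 0.0, 0.2], "Ms. Brown": [0.0, 1.0, 0.1]}
identity, duration_s = recognize_identity([0.9, 0.1, 0.2], references, 32000, 16000)
```

In this example the received features lie closest to the reference stored for Mr. Green, and 32000 samples at an assumed 16000 samples per second give a duration of 2.0 seconds.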
Referring to
In step S2031, the control module 30 receives data as to the duration of the complete speech.
In step S2032, the control module 30 determines the particular text data which corresponds to the full duration of the complete speech.
In step S2033, the control module 30 displays the identity data and the text data. For example, if the text data is “welcome our manager to give a speech”, and the corresponding identity data is Mr. Green, the display 60 displays “Mr. Green: welcome our manager to give a speech”.
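As a non-limiting illustration, steps S2031 to S2033 can be sketched in Python. The description does not state how text data is matched to the duration of the complete speech, so this sketch assumes that each recognized word carries a timestamp in seconds. The function names, the start_s parameter, and the timestamps are hypothetical.

```python
from typing import List, Tuple


def text_for_duration(
    timed_words: List[Tuple[float, str]], start_s: float, duration_s: float
) -> str:
    """S2031-S2032: collect the words spoken within [start_s, start_s + duration_s)."""
    end_s = start_s + duration_s
    return " ".join(word for t, word in timed_words if start_s <= t < end_s)


def format_line(identity: str, text: str) -> str:
    """S2033: build the line shown on the display 60."""
    return f"{identity}: {text}"


# Hypothetical timestamps (seconds) attached to the recognized words.
timed_words = [
    (0.0, "welcome"), (0.4, "our"), (0.7, "manager"), (1.1, "to"),
    (1.3, "give"), (1.5, "a"), (1.7, "speech"), (2.5, "thank"), (2.8, "you"),
]
line = format_line("Mr. Green", text_for_duration(timed_words, 0.0, 2.0))
# line == "Mr. Green: welcome our manager to give a speech"
```

Words that fall outside the duration reported for the complete speech, such as "thank" and "you" here, are left for the next speech and its own identity data.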
It is to be understood, however, that even though numerous characteristics and advantages of the embodiments have been set forth in the foregoing description, together with details of the structure and function of the embodiments, the disclosure is illustrative only, and changes may be made in detail, especially in matters of shape, size, and arrangement of parts within the principles of the present disclosure to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed.
Depending on the embodiment, certain of the steps of a method(s) described may be removed, others may be added, and the sequence of steps may be altered. It is also to be understood that the description and the claims drawn for a method may include some indication in reference to certain steps. However, the indication used is only to be viewed for identification purposes and not as a suggestion as to an order for the steps.
Number | Date | Country | Kind |
---|---|---|---|
100100927 | Jan 2011 | TW | national |
This application is related to co-pending U.S. patent application entitled “SPEECH TO TEXT CONVERTING DEVICE AND METHOD”, Attorney Docket No. US37058, U.S. application Ser. No. ______ filed on ______.