1. Field of the Invention
The present invention relates generally to an audio data managing method and apparatus of an electronic device and, more particularly, to a method and an apparatus for managing audio data in an electronic device, which allow preliminary identification of contents of audio data.
2. Description of the Related Art
With the remarkable development of information communication technologies and semiconductor technologies, electronic devices can now store thousands of data files of various types, including, for example, document data files, image data files, video data files, and audio data files. Storing such a large number of data files makes it difficult for users of the electronic devices to manage the data. For example, a user of the electronic device may have trouble or inconvenience in finding the data which the user wants to obtain. In order to overcome such trouble or inconvenience and to manage data efficiently, prior art electronic devices allow the user to assign a filename to each data file or to input separate tag information for the data file, so that the user of the electronic device can identify the contents of the data.
Conventional electronic devices also provide a preview function for preliminarily showing rough contents of document data, image data, or video data, through a thumbnail image. Through the preview function, a user of the electronic device can preliminarily identify the contents of document data, image data, or video data.
However, there are no electronic devices that provide a preview function for audio data. As a result, users of the electronic devices still have the inconvenience of entering a detailed filename for audio data or inputting separate tag information for the audio data.
The present invention has been made to address the above-described problems and disadvantages, and to provide at least the advantages described below. Accordingly, an aspect of the present invention is to provide a method and an apparatus for managing audio data in an electronic device, which allow preliminary identification of contents of audio data through a text.
Another aspect of the present invention is to provide a method and an apparatus for managing audio data in an electronic device, by which a part of a text converted from audio data can be set as a filename of the audio data.
Another aspect of the present invention is to provide a method and an apparatus for managing audio data in an electronic device, in which talkers are classified according to their tones so as to display converted texts distinguished by the classified talkers.
Another aspect of the present invention is to provide a method and an apparatus for managing audio data in an electronic device, which enable audio data to be searched for based on a text converted from the audio data.
In accordance with an aspect of the present invention, a method of managing audio data of an electronic device is provided. The method includes converting at least a part of audio data to a text; storing the converted text as preview data of the audio data; and displaying the stored preview data in response to a request for a preview of the audio data.
In accordance with another aspect of the present invention, an apparatus for managing audio data in an electronic device is provided. The apparatus includes a display unit for displaying an image; a storage unit for storing one or more audio data; and a controller for converting at least a part of the stored audio data to a text, storing the converted text in the storage unit as preview data, and controlling the display unit such that the stored preview data is displayed in response to a request for a preview of the audio data.
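Purely as a non-limiting illustrative sketch of the method and apparatus summarized above, the conversion, storage, and preview flow may be expressed as follows. The Java class and interface names, including SpeechToTextEngine and AudioPreviewManager, are hypothetical and are not elements recited in the claims.

```java
// Illustrative sketch only. SpeechToTextEngine and AudioPreviewManager are
// hypothetical helper types used to show the flow; they are not claim elements.
import java.util.HashMap;
import java.util.Map;

interface SpeechToTextEngine {
    /** Converts (at least a part of) the given audio data to a text. */
    String convert(byte[] audioData, int maxLetters);
}

class AudioPreviewManager {
    private final SpeechToTextEngine stt;
    private final Map<String, String> previewStore = new HashMap<>(); // filename -> preview text

    AudioPreviewManager(SpeechToTextEngine stt) {
        this.stt = stt;
    }

    /** Converts at least a part of the audio data to a text and stores it as preview data. */
    void generatePreview(String fileName, byte[] audioData) {
        storePreview(fileName, stt.convert(audioData, 30)); // e.g. first 30 letters
    }

    /** Stores (or replaces) the preview data of the given audio data. */
    void storePreview(String fileName, String previewText) {
        previewStore.put(fileName, previewText);
    }

    /** Returns the stored preview data in response to a request for a preview. */
    String getPreview(String fileName) {
        return previewStore.get(fileName);
    }
}
```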
The above and other aspects, features and advantages of the present invention will be more apparent from the following detailed description in conjunction with the accompanying drawings, in which:
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. It should be noted that the same elements will be designated by the same reference numerals although they are shown in different drawings. Further, in the following description of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present invention unclear.
Meanwhile, the embodiments of the present invention shown and described in this specification and the drawings are specific examples presented in order to easily explain the technical contents of the present invention and to help comprehension of the present invention, but are not intended to limit the scope of the present invention. It will be obvious to those skilled in the art to which the present invention pertains that other modified embodiments based on the spirit of the present invention, besides the embodiments disclosed herein, can be carried out.
Prior to the detailed description of the present invention, it is noted that the electronic device may be a terminal providing a voice recognition function, such as a mobile communication terminal, a smart phone, a tablet Personal Computer (PC), a hand-held PC, a Portable Multimedia Player (PMP), a Personal Digital Assistant (PDA), a notebook PC, and the like.
As described below, a method and an apparatus for managing audio data in an electronic device according to an embodiment of the present invention allow preliminary identification of contents of audio data through a text. That is, in the present invention, audio data is converted to a text and preview data of the audio data is generated and stored based on the converted text, so that the stored preview data can be provided in response to a request for a preview. Further, in the present invention, a part of the text converted from the audio data can be set as a filename of the audio data.
Moreover, in the present invention, the converted texts can be displayed with an indication that each of the converted texts is distinguished by a talker. Furthermore, in the present invention, audio data can be searched for based on a text converted from the audio data. As a result, the present invention can enhance the convenience of the user in the management of audio data.
Referring to
The audio processor 160 is connected to a speaker SPK for outputting an audio signal transmitted or received during a communication, an audio signal included in a received message, an audio signal according to reproduction of audio data (or an audio file) stored in the storage unit 120, or an audio signal included in video data (or a video file), and to a microphone MIC for collecting a user's voice or other audio signals. The audio processor 160 may support a function of recording audio data.
The wireless communication unit 150 supports a wireless communication function of the electronic device 100. For example, the wireless communication unit 150 may include a short-range wireless communication module, such as a Bluetooth module, a ZigBee module, a Near Field Communication (NFC) module, or a Wireless Fidelity (Wi-Fi) module, when the electronic device 100 supports short-range wireless communication, and may include a mobile communication module when the electronic device 100 supports a mobile communication function (for example, mobile communication according to the 3G or 4G standards). Meanwhile, when the electronic device 100 does not support the Speech-To-Text (STT) conversion function and the talker classification function, the wireless communication unit 150 may transmit audio data to a server supporting the STT conversion function and the talker classification function and receive a result thereof (for example, a text converted from the audio data, or preview data in the form of an image or a video) from the server.
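As a hypothetical sketch of the server-assisted case mentioned above, transmitting audio data for remote STT conversion could look like the following. The server URL, the request format, and the plain-text response are all assumptions for illustration, not a disclosed protocol.

```java
// Hypothetical sketch of offloading STT conversion to a server when the device
// does not support it locally; the URL and response format are assumptions.
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class RemoteSttClient {
    private final String serverUrl; // e.g. "https://stt.example.com/convert" (assumed)

    RemoteSttClient(String serverUrl) {
        this.serverUrl = serverUrl;
    }

    /** Sends the audio data to the server and returns the converted text it sends back. */
    String convert(byte[] audioData) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(serverUrl).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(audioData); // transmit the audio data to the server
        }
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                response.write(buffer, 0, read); // receive the result from the server
            }
        }
        return response.toString("UTF-8"); // assumed: server responds with plain text
    }
}
```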
The input unit 140 generates a key signal related to a user setting and a function control of the electronic device 100 and transfers the generated key signal to the controller 110. To this end, the input unit 140 may include various input keys and function keys for inputting numerals or character information and setting various functions. The function keys may include a direction key, a side key, and a shortcut key for execution of particular functions. The input unit 140 may be configured by one or a combination of a QWERTY keypad, a 3*4 keypad, a 4*3 keypad, a ball joystick, an optical joystick, a wheel key, a touch key, a touch pad, a touch screen, and other input means. Also, when the electronic device 100 supports a full touch screen, the input unit 140 may include only some of the function keys, such as a volume key, a power key, a menu key, a cancellation key, and a home key. In particular, the input unit 140 according to the present invention generates various input signals for controlling various procedures, including a preview of audio data, setting of a filename of audio data, changing of a filename of audio data, and searching for audio data, and transmits the generated input signals to the controller 110.
The touch screen 130 performs an input function and a display function. To this end, the touch screen 130 includes the display unit 131 and the touch detection unit 132.
The display unit 131 displays various menus of the electronic device 100, information input by the user, or information to be provided to the user. The display unit 131 may be formed by a Liquid Crystal Display (LCD), an Organic Light Emitted Diode (OLED), or an Active Matrix Organic Light Emitted Diode (AMOLED). The display unit 131 provides various screens for use of the electronic device 100, which include, for example, a home screen, a menu screen, and a phone-call screen. In particular, the display unit 131 according to an embodiment of the present invention displays a preview screen of audio data, a screen for generation of a filename of audio data, a screen for changing of a filename of audio data, and a screen for searching for audio data. Such various screens will be described below in detail with reference to the drawings illustrating examples of the screens.
The touch detection unit 132 is a device for providing an input function, and generates a touch event and transmits the generated touch event to the controller 110 when a touch input device, such as a user's finger, a stylus, or an electronic pen, contacts or approaches the touch detection unit 132. Specifically, the touch detection unit 132 detects the occurrence of a touch event through a change in physical properties (for example, a capacitance or a resistance value) caused by the contact or approach of a touch input device, and transfers the touch event type (e.g., tap, drag, flick, long touch, double-touch, multi-touch, etc.) and the touch location information of the detected touch to the controller 110. The touch detection unit 132 as described above is obvious to those skilled in the art, so a detailed description thereof is omitted here.
The storage unit 120 stores an Operating System (OS) of the electronic device 100 and application programs necessary for other optional functions, such as a sound reproduction function, an image or video reproduction function, and a broadcast reproduction function. Further, the storage unit 120 stores various data, such as video data, game data, audio data, and movie data. In particular, the storage unit 120 according to the present invention stores a recording application, an STT conversion application for converting audio data to a text, and a talker classification application for classifying and/or recognizing a talker from audio data.
The storage unit 120 also stores preview data of audio data. The preview data may be generated and stored when new audio data is generated or when audio data is downloaded. Also, in response to a request for a preview of audio data, the preview data may be generated if the audio data has no preview data. The preview data of the audio data may be formed as a text generated by converting the whole or at least a part of the audio data, or as an image (for example, a thumbnail image) or a video (for example, a slide text in which the text moves in a predetermined direction) converted from the text. The preview data of the audio data may be generated by converting into a text a predetermined section of the audio data from the start point of the audio data. Alternatively, the preview data of the audio data may be generated by converting into a text a predetermined section of the audio data from a position at which the reproduction of the audio data is stopped (or terminated). The predetermined section may be set by time (for example, 10 seconds) or by the number of characters or letters (for example, 30 letters).
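A minimal sketch of how such a predetermined section could be selected and limited is given below, assuming the 10-second and 30-letter limits used as examples above. The class and method names are hypothetical.

```java
// Hypothetical sketch of selecting the section of audio data to convert and
// limiting the resulting preview text; the limits mirror the examples above.
class PreviewSectionPolicy {
    static final int SECTION_SECONDS = 10; // limit by time, or
    static final int SECTION_LETTERS = 30; // limit by number of letters

    /**
     * Chooses the start of the section to convert: either the beginning of the
     * audio data or the position at which reproduction was last stopped.
     */
    static int sectionStartMillis(int lastStoppedMillis, boolean fromLastStop) {
        return fromLastStop ? lastStoppedMillis : 0;
    }

    /** Truncates the converted text to the configured number of letters. */
    static String truncate(String convertedText) {
        return convertedText.length() <= SECTION_LETTERS
                ? convertedText
                : convertedText.substring(0, SECTION_LETTERS);
    }
}
```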
When the audio data includes a plurality of talkers, the storage unit 120 may store a talker classification program for classifying and displaying converted texts according to the talkers. The talker classification program classifies talkers through tone analysis. Alternatively, when a phone call is recorded, the talker classification program may classify talkers according to whether audio data is received through the microphone MIC or the wireless communication unit 150. For example, the talker classification program may classify audio data received through the microphone MIC as audio data of the user and audio data received through the wireless communication unit 150 as audio data of a counterpart. In this event, the talker classification program may recognize the counterpart through the phone number. Alternatively, the storage unit 120 may store a talker database for recognizing talkers. The talker database may store a tone property of each talker.
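The call-recording case described above can be sketched as follows, assuming converted text segments tagged with their source. The AudioSegment type and the contact lookup are hypothetical illustrations, not the disclosed talker classification program itself.

```java
// Sketch of the call-recording case: segments captured from the microphone are
// attributed to the user, segments received over the wireless communication
// unit to the counterpart, who is looked up by phone number (assumed mapping).
enum AudioSource { MICROPHONE, WIRELESS }

class AudioSegment {
    final AudioSource source;
    final String text; // text converted from this segment

    AudioSegment(AudioSource source, String text) {
        this.source = source;
        this.text = text;
    }
}

class CallTalkerClassifier {
    private final String counterpartName; // resolved from the counterpart's phone number

    CallTalkerClassifier(String counterpartPhoneNumber, java.util.Map<String, String> contacts) {
        this.counterpartName = contacts.getOrDefault(counterpartPhoneNumber, counterpartPhoneNumber);
    }

    /** Labels each converted text with the talker it is attributed to. */
    String label(AudioSegment segment) {
        String talker = (segment.source == AudioSource.MICROPHONE) ? "Me" : counterpartName;
        return talker + ": " + segment.text;
    }
}
```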
The controller 110 controls general operations of the electronic device 100 and a signal flow between internal blocks of the electronic device 100, and performs a data processing function for processing data. For example, the controller 110 may be configured by a Central Processing Unit (CPU), an application processor, etc. The controller 110 may be configured by a single core processor or a multi-core processor. The controller 110 controls a preview function of audio data. More specifically, the controller 110 converts at least a part of the audio data to a text through a voice recognition function, and generates and stores the converted text as preview data. Thereafter, the controller 110 identifies whether pre-stored preview data exists in response to a request for a preview of the audio data. When the pre-stored preview data exists, the controller 110 performs control such that the pre-stored preview data is displayed. Meanwhile, when the preview data does not exist, the controller 110 performs a process of generating the preview data of the audio data.
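A minimal sketch of this preview control flow is shown below, reusing the hypothetical AudioPreviewManager from the earlier sketch; the returned text stands in for what the display unit 131 would render.

```java
// Minimal sketch of the preview control flow described above.
class PreviewController {
    private final AudioPreviewManager previewManager;

    PreviewController(AudioPreviewManager previewManager) {
        this.previewManager = previewManager;
    }

    /** Handles a request for a preview of the given audio data. */
    String onPreviewRequested(String fileName, byte[] audioData) {
        String preview = previewManager.getPreview(fileName);
        if (preview == null) {
            // No pre-stored preview data exists: generate it first.
            previewManager.generatePreview(fileName, audioData);
            preview = previewManager.getPreview(fileName);
        }
        return preview; // the display unit would display this preview data
    }
}
```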
Further, the controller 110 controls a process of generating or changing a filename of the audio data. More specifically, when there is a request for generating or changing a filename of the audio data, the controller 110 converts a part of the audio data to a text, displays the converted text, and sets a part selected from the displayed text as the filename of the audio data for which the request was made.
Further, the controller 110 controls a process of searching for audio data. More specifically, after receiving a search word through a search word input window and converting the audio data to a text, the controller 110 identifies whether the converted text includes the search word. Thereafter, the controller 110 displays audio data including the search word as a search result.
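A sketch of this on-demand search is given below, reusing the hypothetical SpeechToTextEngine interface from the earlier sketch; the map of stored audio data stands in for the storage unit 120.

```java
// Sketch of the search described above: each stored audio file is converted to
// text on demand and matched against the search word.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class AudioSearch {
    private final SpeechToTextEngine stt;

    AudioSearch(SpeechToTextEngine stt) {
        this.stt = stt;
    }

    /** Returns the filenames of audio data whose converted text includes the search word. */
    List<String> search(Map<String, byte[]> storedAudio, String searchWord) {
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : storedAudio.entrySet()) {
            String text = stt.convert(entry.getValue(), Integer.MAX_VALUE);
            if (text != null && text.contains(searchWord)) {
                results.add(entry.getKey());
            }
        }
        return results;
    }
}
```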
In order to control the above-described processes, the controller 110 includes the Speech-To-Text (STT) conversion unit 111 and the talker classification unit 112. The STT conversion unit 111 converts audio data to a text; that is, the STT conversion unit 111 supports a voice recognition function. The talker classification unit 112 distinguishes the converted texts by talker so that they can be displayed accordingly. To this end, the talker classification unit 112 analyzes tones of the audio data to classify a plurality of talkers included in the audio data. Meanwhile, when the storage unit 120 stores the talker database for recognizing talkers, the talker classification unit 112 provides a talker recognition function. Through the talker recognition function, the controller 110 displays the converted texts classified according to the respective talkers. Alternatively, for audio data generated by recording a phone call, the talker classification unit 112 recognizes a talker through the call counterpart's phone number.
Meanwhile, although not illustrated in
Referring to
The controller 110 determines whether pre-stored preview data exists in step 203. The pre-stored preview data may be a text converted from at least a part of the audio data, or an image or a video generated based on the text. The video may be one in which the text moves in a predetermined direction at a predetermined speed, one in which the text is output as if the user were writing it by hand, or one in which the text changes in predetermined units (for example, 10 letters). However, the present invention is not limited thereto.
When the preview data exists in step 203, the controller 110 outputs the preview data mapped to the audio data in step 215 and proceeds to step 213 described below. A method of displaying the preview data will be described below with reference to
The controller 110 displays the converted text in step 207. The controller 110 determines whether a request for storing the preview data is received in step 209. When the request for storing the preview data is received, the controller 110 stores the converted text as the preview data of the audio data in step 211. In contrast, when the request for storing the preview data is not received, the controller 110 proceeds to step 213 to determine whether the preview is terminated. When the preview is not terminated, the controller 110 may remain in step 213. In contrast, when the preview is terminated, the controller 110 terminates a process of displaying the preview of the audio data.
Meanwhile, although it has been described that the controller 110 determines whether the preview data exists in response to a request for the preview, generates the preview data of the audio data when the preview data does not exist, and stores the generated preview data according to a request for storing the preview data by the user, the present invention is not limited thereto. For example, when new audio data is generated (for example, recorded) or audio data is downloaded, the controller 110 may generate preview data of the new audio data or the downloaded audio data and automatically store the generated preview data. Further, the controller 110 may update the preview data based on a position where the audio data is reproduced last. For example, when a reproduction of the audio data is stopped or terminated during the reproduction, the controller 110 may convert a predetermined section (for example, time or the number of letters) of the audio data from a position where the reproduction is stopped or terminated to a text, so as to update preview data.
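Updating the preview data from the position where reproduction stopped could be sketched as follows, again reusing the hypothetical SpeechToTextEngine and AudioPreviewManager types from the earlier sketches; the 10-second window and the section-extraction placeholder are assumptions.

```java
// Sketch of re-generating the preview data when reproduction stops or is terminated.
class PreviewUpdater {
    private final SpeechToTextEngine stt;
    private final AudioPreviewManager previewManager;

    PreviewUpdater(SpeechToTextEngine stt, AudioPreviewManager previewManager) {
        this.stt = stt;
        this.previewManager = previewManager;
    }

    /** Converts a predetermined section from the stop position and updates the preview data. */
    void onPlaybackStopped(String fileName, byte[] audioData, int stoppedAtMillis) {
        byte[] section = extractSection(audioData, stoppedAtMillis, 10_000); // ~10 seconds
        previewManager.storePreview(fileName, stt.convert(section, 30));
    }

    // Placeholder: cutting a time window out of the raw audio depends on the audio format.
    private byte[] extractSection(byte[] audioData, int startMillis, int durationMillis) {
        return audioData; // simplified for the sketch
    }
}
```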
Referring to
Meanwhile, the present invention is not limited to displaying the converted texts distinguished by talkers, respectively. For example, when the electronic device 100 does not provide a talker classification function, the preview data may be displayed in a form where the converted texts are listed.
Referring to
As described above with reference to
Referring to
While the audio data is continuously recorded, the controller 110 receives a request for terminating the recording of the audio data in step 403. When receiving the request for terminating the recording, the controller 110 performs voice recognition on the recorded audio data in step 405, and converts the voice-recognized contents to a text in step 407. For example, the controller 110 may control the STT conversion unit 111 such that the recorded audio data is converted to the text. At this time, the controller 110 performs control such that audio data for a predetermined duration (for example, ten seconds) from the start position of the recorded audio data is converted to the text. Alternatively, the controller 110 may perform control such that audio data for the predetermined duration from the time point at which an audio signal greater than or equal to a predetermined level is first detected is converted to the text. This is because the audio data may include a section in which only a noise signal is recorded without an input of an audio signal.
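The choice of conversion start point described above, including skipping a leading noise-only section, might be sketched as follows; the 16-bit PCM representation and the amplitude threshold are assumptions made for illustration.

```java
// Sketch of finding where text conversion should start: either the beginning of
// the recording, or the first sample whose level exceeds a threshold, so that a
// leading noise-only section is skipped.
class ConversionStartFinder {
    static final int LEVEL_THRESHOLD = 2000; // hypothetical amplitude threshold

    /** Returns the index of the first 16-bit sample whose amplitude meets the threshold. */
    static int findSpeechStart(short[] pcmSamples) {
        for (int i = 0; i < pcmSamples.length; i++) {
            if (Math.abs(pcmSamples[i]) >= LEVEL_THRESHOLD) {
                return i;
            }
        }
        return 0; // no sample above the threshold: fall back to the start of the recording
    }
}
```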
Next, the controller 110 determines whether to store the recorded audio data in step 409. At this time, the controller 110 may provide the converted text through a filename of the recorded audio data. For example, as indicated by screen 510 of
When it is determined to store the recorded audio data in step 409, the controller 110 stores the audio data in step 417 and outputs an audio data list screen in step 419. In contrast, when it is determined not to store the recorded audio data in step 409, the controller 110 determines whether a text conversion position of the audio data is changed in step 411. When the text conversion position of the audio data is changed, the controller 110 converts the audio data at the changed position to the text and displays the converted text in step 413. For example, as indicated by screen 520 of
Meanwhile, although it has been described that the text conversion position is changed by touching the visualized image 52 as indicated by the screen 520, the present invention is not limited thereto. For example, the text conversion position may be changed according to a movement of a reproduction bar 31 along a time line 32.
The controller 110 determines whether a part of the converted and displayed text is selected in step 415. When the part of the converted and displayed text is not selected, the controller 110 returns to step 411. In contrast, when the part of the converted and displayed text is selected, the controller 110 returns to step 409. For example, when a partial text 54 (“first step in Korea”) is selected in the speech bubble 53 by the user as indicated by a screen 530 of
The popup window 55 includes the selected partial text 54 (“first step in Korea”). When “YES” is selected in the popup window 55, the controller 110 outputs an audio data list screen as indicated by a screen 550 of
Meanwhile, although it has been described that the popup window 55 asking about whether to store the partial text 54 is directly output once the partial text 54 is selected in the speech bubble 53, the present invention is not limited thereto. For example, the controller 110 may output the popup window 55 when the partial text 54 is selected and then a storage command is received. Further, although it has been illustrated in
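As a simple hypothetical sketch of the step in which the selected partial text becomes the filename of the newly recorded audio data, the following stores the recording under that name; the in-memory file store merely stands in for the storage unit 120.

```java
// Sketch of saving a new recording under the partial text selected by the user.
import java.util.HashMap;
import java.util.Map;

class RecordingSaver {
    private final Map<String, byte[]> fileStore = new HashMap<>(); // filename -> audio data

    /**
     * Called when the user confirms the popup: the partial text selected from the
     * converted text (e.g. "first step in Korea") is used as the filename.
     */
    void saveWithSelectedText(String selectedPartialText, byte[] recordedAudio) {
        fileStore.put(selectedPartialText, recordedAudio);
    }
}
```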
According to the aforementioned embodiments of the present invention, when new audio data is recorded and stored, contents of the recorded audio data can be preliminarily identified through a text, and a part of the text may be set as a filename. Accordingly, the user of the electronic device 100 according to an embodiment of the present invention can easily set a filename of new audio data.
Referring to
Thereafter, the controller 110 determines whether a request for changing a filename is received in step 605. The request for changing the filename of the particular audio data may be made through a menu or a preset touch input (for example, a long touch). When the request (command) for changing the filename is not received, the controller 110 performs a corresponding function in step 607. For example, the controller 110 may perform a function of reproducing, deleting, or moving the selected particular audio data or maintain a standby state according to a request of the user. In contrast, when the request (command) for changing the filename is received, the controller 110 converts the selected particular audio data to a text and displays the converted text in step 609. For example, when the request for changing the filename of particular audio data 71 is received in a list screen as indicated by a screen 710 of
The controller 110 then determines whether a text conversion position of the audio data is changed in step 611. When the text conversion position of the audio data is changed, the controller 110 converts the audio data at the changed position to the text and displays the converted text in step 613. For example, the controller 110 may convert audio data in a predetermined section from a position of the reproduction bar 31 on the time line 32 to the text and display the converted text on the expanded region 72.
The controller 110 then determines whether a part of the text displayed on the expanded region 72 is selected in step 615. When the part of the text displayed on the expanded region 72 is not selected, the controller 110 returns to step 611. In contrast, when the part of the text displayed on the expanded region 72 is selected, the controller 110 determines whether to store the part of the text in step 617. When it is determined not to store the part of the text, the controller 110 returns to step 611. In contrast, when it is determined to store the part of the text, the controller 110 changes a filename of the particular audio data 71 into the selected partial text in step 619. For example, when a partial text 73 (“raining outside”) is selected in the expanded region 72 by the user as indicated by a screen 730 of
Meanwhile, although it has been described that the popup window 74 asking about whether to store the partial text 73 is directly output once the partial text 73 is selected in the expanded region 72, the present invention is not limited thereto. For example, the controller 110 may output the popup window 74 asking about whether to store the partial text 73 when the partial text 73 is selected and then a storage command is received. Further, although it has been illustrated that the converted texts are listed and displayed on the expanded region 72, the converted texts may be displayed such that the converted texts are classified according to respective talkers as illustrated in
The user of the electronic device according to the aforementioned embodiments of the present invention can preliminarily identify the contents of the audio data through a text when a filename of the audio data is changed, and can easily change the filename of the audio data by using a part of the text.
Referring to
The controller 110 receives a search request in step 807. When the search request is received, the controller 110 converts the audio data stored in the storage unit 120 to texts in step 809.
The controller 110 determines whether the search word is included in the converted texts to search for audio data in step 811, and then displays a search result in step 813.
Meanwhile, although it has been described that the audio data is converted to the text and then audio files including the search word are found when a search request is made, in another embodiment of the present invention the text converted from the recorded audio data may be mapped to the corresponding audio data and pre-stored in order to increase the search efficiency for the audio data. In this case, step 809 may be omitted.
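A sketch of this pre-stored variant is given below: the text converted when the audio is recorded is kept in an index mapped to the audio data, so that a later search only scans the stored texts instead of re-running STT conversion. The index class and its names are hypothetical.

```java
// Sketch of a pre-stored text index mapped to the audio data for faster searching.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AudioTextIndex {
    private final Map<String, String> textByFileName = new HashMap<>();

    /** Stores the converted text mapped to the corresponding audio data. */
    void put(String fileName, String convertedText) {
        textByFileName.put(fileName, convertedText);
    }

    /** Returns the filenames of all audio data whose converted text contains the search word. */
    List<String> search(String searchWord) {
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, String> entry : textByFileName.entrySet()) {
            if (entry.getValue().contains(searchWord)) {
                results.add(entry.getKey());
            }
        }
        return results;
    }
}
```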
The user of the electronic device 100 according to the aforementioned embodiments of the present invention can, through the search function, search for audio data based on the actual contents of the audio data rather than the filename of the audio data. Accordingly, the user can easily find audio data including the contents which the user desires.
Although it has been described that only audio data is found when the search request is made, the present invention is not limited thereto. For example, in another embodiment of the present invention, other data (for example, a document file, an image file, and the like) including the search word may be also found in a search mode. Alternatively, in yet another embodiment of the present invention, a menu for setting whether to search for audio data is provided, and the audio data may be found when the electronic device is set to search for the audio data.
In addition, although the embodiments of the present invention have described the audio data as an example, the present invention may be applied to video data including both audio data and image data. When the present invention is applied to the video data, the electronic device may provide at least one of an image and a text in a preview form.
The method of managing audio data of the electronic device according to the embodiments of the present invention may be implemented in the form of program commands which can be executed through various computer means and recorded in a computer-readable recording medium. The computer-readable recording medium may include a program command, a data file, and a data structure alone or in combination. The program commands recorded in the recording medium may be specially designed and configured for the present invention, or may be known to and usable by those skilled in the computer software field. The computer-readable recording medium includes magnetic media such as hard disks, floppy disks and magnetic tapes, optical media such as a Compact Disc Read-Only Memory (CD-ROM) and a Digital Versatile Disc (DVD), magneto-optical media such as floptical disks, and hardware devices such as a Read-Only Memory (ROM), a Random Access Memory (RAM) and a flash memory, which are specially configured to store and execute program commands. Further, the program commands include machine language code generated by a compiler and high-level language code executable by a computer through an interpreter and the like. The hardware devices may be configured to operate as one or more software modules in order to perform the operations of the present invention.
Although the method and the apparatus for managing audio data of the electronic device according to the embodiments of the present invention have been described through the specification and drawings using specific terms, the embodiments and the terms are merely used in their general meanings to easily describe the technical contents of the present invention and to assist understanding of the present invention, and the present invention is not limited to these embodiments. That is, it is apparent to those skilled in the art that various other embodiments based on the technical idea of the present invention can be implemented.
This application claims priority under 35 U.S.C. §119(a) to Korean Patent Application No. 10-2013-0057369, filed on May 21, 2013, the entire contents of which are incorporated herein by reference.