This application claims all benefits accruing under 35 U.S.C. §119 from Taiwan Patent Application No. 102116969, filed on May 14, 2013 in the Taiwan Intellectual Property Office. The contents of the Taiwan Application are hereby incorporated by reference.
1. Technical Field
The disclosure generally relates to voice processing technologies, and particularly relates to voice recording systems and methods.
2. Description of Related Art
More and more electronic devices, such as notebook computers, tablet computers, and smart phones, are designed to support voice recording functions. However, the voices recorded by these electronic devices are often not of sufficiently high quality to meet high-definition audio requirements.
Therefore, there is room for improvement within the art.
Many aspects of the embodiments can be better understood with reference to the following drawings. The components in the drawings are not necessarily drawn to scale, the emphasis instead being placed upon clearly illustrating the principles of the embodiments. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the views.
The disclosure is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like reference numerals indicate similar elements. It should be noted that references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references can mean “at least one.”
In general, the word “module,” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions, written in a programming language such as Java, C, or assembly. One or more software instructions in the modules may be embedded in firmware, such as in an erasable-programmable read-only memory (EPROM). The modules described herein may be implemented as software and/or hardware modules and may be stored in any type of non-transitory computer-readable medium or other storage device. Some non-limiting examples of non-transitory computer-readable media are compact discs (CDs), digital versatile discs (DVDs), Blu-Ray discs, Flash memory, and hard disk drives.
The processor 101 may be implemented or performed with a general purpose processor, a content addressable memory, a digital signal processor, an application specific integrated circuit, a field programmable gate array, any suitable programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination designed to perform the functions described here.
The memory 102 may be realized as RAM memory, flash memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. The memory 102 is coupled to the processor 101 such that the processor 101 can read information from, and write information to, the memory 102. The memory 102 can be used to store computer-executable instructions. The computer-executable instructions, when read and executed by the processor 101, cause the electronic device 10 to perform certain tasks, operations, functions, and processes described in more detail herein.
The user interface 103 may include or cooperate with various features to allow a user to interact with the electronic device 10. Accordingly, the user interface 103 may include various human-to-machine interfaces, e.g., a keypad, keys, a keyboard, buttons, switches, knobs, a touchpad, a joystick, a pointing device, a virtual writing tablet, a touch screen, or any device, component, or function that enables the user to select options, input information, or otherwise control the operation of the electronic device 10. In various embodiments, the user interface 103 may include one or more graphical user interface (GUI) control elements that enable a user to manipulate or otherwise interact with an application via the display 106.
The two microphones 104 may receive sound and convert the sound into electronic signals, which can be stored and processed in a computing device.
The camera 105 may record images. The images may be photographs or moving images such as videos or movies. The camera 105 may be used to detect a user in front of it and recognize the face of the user.
The display 106 is suitably configured to enable the electronic device 10 to render and display various screens, GUIs, GUI control elements, drop down menus, auto-fill fields, text entry fields, message fields, or the like. Of course, the display 106 may also be utilized for the display of other information during the operation of the electronic device 10, as is well understood.
The voice recording system 20 may be implemented using software, firmware, and computer programming technologies.
The electronic device 10 may be realized in any common form factor including, without limitation: a desktop computer, a mobile computer (e.g., a tablet computer, a laptop computer, or a netbook computer), a smartphone, a video game device, a digital media player, or the like.
The space dividing module 201 divides the space in front of the camera 105 into a plurality of imaginary cubic areas. For example, the space in front of the camera 105 may be divided into 27 (3 by 3 by 3) imaginary cubic areas as shown in
The delay calculating module 202 may calculate a delay parameter for each of the plurality of imaginary cubic areas and associate each imaginary cubic area with the corresponding delay parameter. A delay parameter represents the difference between the time for sound to travel from an imaginary cubic area to one of the two microphones 104 and the time for sound to travel from the imaginary cubic area to the other of the two microphones 104. As shown in
Δ = (D1 − D2) / C

where Δ is the delay parameter, D1 is the distance between the imaginary cubic area and one of the two microphones 104, D2 is the distance between the imaginary cubic area and the other of the two microphones 104, and C is the speed of sound.
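The per-area delay computation described above can be sketched as follows. The grid dimensions, cube size, microphone coordinates, and helper names here are illustrative assumptions, not the patented implementation.

```python
# Sketch (under assumed geometry): compute the delay parameter
# delta = (D1 - D2) / C for each imaginary cubic area, with the
# camera at the origin and the two microphones on either side of it.
import itertools
import math

SPEED_OF_SOUND = 343.0  # metres per second, at roughly 20 degrees C

def delay_parameters(mic1, mic2, grid=3, cube=0.5):
    """Map each cubic area's (i, j, k) index to its delay parameter."""
    delays = {}
    for i, j, k in itertools.product(range(grid), repeat=3):
        # Centre of the cubic area, in metres from the camera.
        centre = ((i + 0.5) * cube, (j + 0.5) * cube, (k + 0.5) * cube)
        d1 = math.dist(centre, mic1)  # D1: distance to the first microphone
        d2 = math.dist(centre, mic2)  # D2: distance to the second microphone
        delays[(i, j, k)] = (d1 - d2) / SPEED_OF_SOUND
    return delays

delays = delay_parameters(mic1=(-0.1, 0.0, 0.0), mic2=(0.1, 0.0, 0.0))
print(len(delays))  # 27 cubic areas for a 3-by-3-by-3 grid
```

Because every cube centre in this toy geometry lies closer to the second microphone, all delay parameters come out positive; a source on the midline between the microphones would have a delay parameter of zero.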
The user detecting module 203 may instruct the camera 105 to detect whether multiple users appear in front of the camera 105.
When multiple users are detected in front of the camera 105, the user selecting module 204 may recognize the mouth movements of each of the multiple users and select the user whose mouth movements are the largest among the multiple users.
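One plausible way to rank mouth movements is by how much each user's mouth region changes between video frames. The sketch below assumes face and mouth localisation have already been done; `mouth_rois` is a hypothetical per-user list of mouth-region frames (2-D lists of pixel intensities), not part of the disclosed system.

```python
# Hedged sketch of active-speaker selection by inter-frame difference
# over a hypothetical mouth region of interest.
def mouth_movement_score(frames):
    """Mean absolute pixel difference between consecutive frames."""
    total = count = 0
    for a, b in zip(frames, frames[1:]):
        for row_a, row_b in zip(a, b):
            for pa, pb in zip(row_a, row_b):
                total += abs(pa - pb)
                count += 1
    return total / count if count else 0.0

def select_speaker(mouth_rois):
    """Return the index of the user with the largest mouth movement."""
    return max(range(len(mouth_rois)),
               key=lambda u: mouth_movement_score(mouth_rois[u]))

# Toy example: user 0's mouth region is static, user 1's fluctuates.
static = [[[0, 0], [0, 0]]] * 3
moving = [[[0, 0], [0, 0]], [[40, 40], [40, 40]], [[0, 0], [0, 0]]]
print(select_speaker([static, moving]))  # 1
```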
The imaginary cubic area determining module 205 may instruct the camera 105 to locate the face of the selected user and determine, from among the plurality of imaginary cubic areas, the imaginary cubic area in which the face is located.
The wave beam calculating module 206 may calculate a wave beam pointing to the imaginary cubic area according to the delay parameter associated with the imaginary cubic area.
The voice recording module 207 may instruct the two microphones 104 to record voices inside the range of the wave beam and suppress noises outside of the range of the wave beam.
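A common way to realise such a beam with two microphones is delay-and-sum beamforming; the sketch below assumes that is the intended technique, steering the beam by applying the cubic area's delay parameter to one channel before summing. Sample rate and signal shapes are assumptions.

```python
# Sketch of a two-microphone delay-and-sum beamformer. Sound arriving
# from the steered cubic area adds coherently across the two channels;
# sound from other directions is partially cancelled (noise suppression).
def delay_and_sum(mic1_samples, mic2_samples, delay_seconds, sample_rate=16000):
    """Align the second channel by the delay parameter, then average."""
    shift = round(delay_seconds * sample_rate)  # delay in whole samples
    out = []
    for n, s1 in enumerate(mic1_samples):
        m = n - shift
        s2 = mic2_samples[m] if 0 <= m < len(mic2_samples) else 0.0
        out.append(0.5 * (s1 + s2))
    return out

# A source aligned with the beam (zero relative delay) passes unchanged.
tone = [1.0, -1.0, 1.0, -1.0]
print(delay_and_sum(tone, tone, delay_seconds=0.0))  # [1.0, -1.0, 1.0, -1.0]
```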
The voice monitoring module 208 may monitor whether a difference between voices recorded by the two microphones 104 exceeds a predetermined threshold.
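The disclosure does not define how the difference between the two recorded voices is measured; one reasonable assumption is the root-mean-square of the sample-wise channel difference, sketched here.

```python
# Hedged sketch of the monitoring check: RMS of the difference between
# the two microphone channels, compared against a threshold elsewhere.
def channel_difference(mic1_samples, mic2_samples):
    """Root-mean-square sample-wise difference between the channels."""
    n = min(len(mic1_samples), len(mic2_samples))
    if n == 0:
        return 0.0
    return (sum((a - b) ** 2
                for a, b in zip(mic1_samples, mic2_samples)) / n) ** 0.5

print(channel_difference([1.0, 1.0], [1.0, 1.0]))  # 0.0
print(channel_difference([1.0, 0.0], [0.0, 0.0]) > 0.5)  # True
```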
When the difference between voices recorded by the two microphones 104 exceeds the predetermined threshold, the wave beam recalculating module 209 may recalculate the wave beam pointing to the imaginary cubic area by applying a particle swarm optimization (PSO) algorithm.
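Particle swarm optimization is the one algorithm the disclosure names for this step. The sketch below shows a minimal PSO over a one-dimensional steering-delay candidate; the cost function, parameter bounds, and swarm settings are illustrative assumptions, since the disclosure does not specify them.

```python
# Minimal particle swarm optimization sketch. Each particle holds a
# candidate steering delay; particles are pulled toward their personal
# best and the swarm's global best until the cost is minimised.
import random

def pso(cost, lo, hi, particles=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=1):
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(particles)]  # positions
    vs = [0.0] * particles                                # velocities
    pbest = xs[:]                                         # personal bests
    gbest = min(pbest, key=cost)                          # global best
    for _ in range(iters):
        for i in range(particles):
            vs[i] = (w * vs[i]
                     + c1 * rng.random() * (pbest[i] - xs[i])
                     + c2 * rng.random() * (gbest - xs[i]))
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))       # clamp to bounds
            if cost(xs[i]) < cost(pbest[i]):
                pbest[i] = xs[i]
        gbest = min(pbest, key=cost)
    return gbest

# Toy cost with its minimum at 0.3: the swarm converges near 0.3.
best = pso(lambda d: (d - 0.3) ** 2, lo=-1.0, hi=1.0)
print(best)
```

In the system described here, the cost function would score a candidate beam, for example by the inter-microphone difference monitored in the previous step, rather than this toy quadratic.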
In step S601, the space dividing module 201 divides the space in front of the camera 105 into a plurality of imaginary cubic areas.
In step S602, the delay calculating module 202 calculates a delay parameter for each of the plurality of imaginary cubic areas and associates each imaginary cubic area with the corresponding delay parameter.
In step S603, the user detecting module 203 instructs the camera 105 to detect whether multiple users appear in front of the camera 105. If multiple users are detected in front of the camera 105, the flow proceeds to step S604; otherwise, the flow proceeds to step S605.
In step S604, the user selecting module 204 recognizes the mouth movements of each of the users and selects the user whose mouth movements are the largest among the users.
In step S605, the imaginary cubic area determining module 205 instructs the camera 105 to locate the face of the selected user.
In step S606, the imaginary cubic area determining module 205 determines, from among the plurality of imaginary cubic areas, the imaginary cubic area in which the face is located.
In step S607, the wave beam calculating module 206 calculates a wave beam pointing to the imaginary cubic area according to the delay parameter associated with the imaginary cubic area.
In step S608, the voice recording module 207 instructs the two microphones 104 to record voices within a range of the wave beam and to suppress noises outside of the range of the wave beam.
In step S609, the voice monitoring module 208 monitors whether a difference between voices recorded by the two microphones 104 exceeds a predetermined threshold. If the difference between voices recorded by the two microphones 104 exceeds the predetermined threshold, the flow proceeds to step S610; otherwise, the flow ends.
In step S610, the wave beam recalculating module 209 recalculates the wave beam pointing to the imaginary cubic area by applying a PSO algorithm.
In step S611, the voice recording module 207 instructs the two microphones 104 to record voices inside the range of the recalculated wave beam.
Although numerous characteristics and advantages have been set forth in the foregoing description of embodiments, together with details of the structures and functions of the embodiments, the disclosure is illustrative only, and changes may be made in detail, especially in the matters of arrangement of parts within the principles of the disclosure to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed.
In particular, depending on the embodiment, certain steps or methods described may be removed, others may be added, and the sequence of steps may be altered. The description and the claims drawn for or in relation to a method may give some indication in reference to certain steps. However, any indication given is only to be viewed for identification purposes, and is not necessarily a suggestion as to an order for the steps.
Number | Date | Country | Kind |
---|---|---|---|
102116969 | May 2013 | TW | national |