This application claims priority to Chinese Patent Application No. 201710766457.7, filed on Aug. 30, 2017, the contents of which are incorporated by reference herein.
The subject matter herein generally relates to voice control of electronic devices.
Digital cameras and smartphones may capture people's voices or videos of people in noisy environments. Thus, the obtained audio or video may include noise.
Implementations of the present technology will now be described, by way of example only, with reference to the attached figures.
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the exemplary embodiments described herein. However, it will be understood by those of ordinary skill in the art that the exemplary embodiments described herein can be practiced without these specific details. In other instances, methods, procedures, and components have not been described in detail so as not to obscure the relevant features being described. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features. The description is not to be considered as limiting the scope of the exemplary embodiments described herein.
A definition that applies throughout this disclosure will now be presented.
The term “comprising” means “including, but not necessarily limited to”; it specifically indicates open-ended inclusion or membership in a so-described combination, group, series, and the like.
The first storage unit 20 is configured to store at least one first voice. The voice acquiring unit 22 is configured to acquire a second voice. The voice analyzing unit 24 is configured to analyze features of the first voice and the second voice, such features including timbre, tone, and loudness. The determining unit 26 is configured to determine whether the second voice contains sounds whose features differ from those of the first voice. The voice filtering unit 28 is configured to filter such differing sounds out of the second voice. In the exemplary embodiment, the voice filtering system 100 further includes a second storage unit 30. The second storage unit 30 is configured to store the filtered second voice.
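By way of illustration only, one possible realization of the analyzing, determining, and filtering operations is sketched below in Python. The particular features used (a spectral-centroid proxy for timbre and tone, and RMS energy for loudness), the frame length, and the tolerance threshold are assumptions made for the sketch and are not fixed by this disclosure.

    # A minimal sketch of the analyze/determine/filter pipeline described above.
    # Feature choices and thresholds are assumptions, not part of the disclosure.
    import numpy as np

    FRAME = 1024  # assumed frame length in samples

    def frame_features(signal, rate):
        """Return per-frame (spectral centroid, RMS loudness) for a mono signal."""
        feats = []
        for start in range(0, len(signal) - FRAME, FRAME):
            frame = signal[start:start + FRAME]
            spectrum = np.abs(np.fft.rfft(frame))
            freqs = np.fft.rfftfreq(FRAME, d=1.0 / rate)
            centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
            rms = np.sqrt(np.mean(frame ** 2))
            feats.append((centroid, rms))
        return np.array(feats)

    def filter_second_voice(first_voice, second_voice, rate, tol=0.35):
        """Mute frames of the second voice whose features differ from the
        stored first voice by more than `tol` (relative difference)."""
        profile = frame_features(first_voice, rate).mean(axis=0)  # voice analyzing unit 24
        out = second_voice.copy()
        for i, feat in enumerate(frame_features(second_voice, rate)):
            diff = np.abs(feat - profile) / (np.abs(profile) + 1e-12)
            if np.any(diff > tol):                                # determining unit 26
                out[i * FRAME:(i + 1) * FRAME] = 0.0              # voice filtering unit 28
        return out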
In an exemplary embodiment, the first storage unit 20 prestores a number of first voices. The voice analyzing unit 24 is configured to analyze features of each first voice and of the second voice. The voice filtering unit 28 is configured to filter, from the second voice, sounds whose features differ from the features of the prestored first voices.
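A corresponding sketch for several prestored first voices, reusing the frame_features helper and FRAME constant above, might keep a frame of the second voice when its features match any stored profile; this particular matching rule is an assumption for the sketch.

    # Sketch extension for several prestored first voices. A frame is muted only
    # when it matches none of the stored profiles (assumption).
    def filter_against_profiles(profiles, second_voice, rate, tol=0.35):
        """profiles: e.g. [frame_features(v, rate).mean(axis=0) for v in first_voices]."""
        out = second_voice.copy()
        for i, feat in enumerate(frame_features(second_voice, rate)):
            matches_any = any(
                np.all(np.abs(feat - p) / (np.abs(p) + 1e-12) <= tol) for p in profiles
            )
            if not matches_any:
                out[i * FRAME:(i + 1) * FRAME] = 0.0
        return out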
In an exemplary embodiment, the voice acquiring unit 22 includes a microphone and the microphone captures the second voice. In another exemplary embodiment, the second voice is transmitted from another electronic device communicating with the electronic device.
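For the microphone case, the voice acquiring unit 22 might be sketched with the third-party sounddevice package; the package choice, sample rate, and recording duration are assumptions and not part of the disclosure.

    # A hedged sketch of acquiring the second voice from a microphone.
    import sounddevice as sd

    RATE = 16000      # assumed sample rate
    DURATION = 5      # assumed recording length in seconds

    second_voice = sd.rec(int(DURATION * RATE), samplerate=RATE, channels=1)
    sd.wait()                          # block until the recording is complete
    second_voice = second_voice.flatten()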
The voice filtering system 100 further includes a selecting unit 38. The selecting unit 38 is configured to select at least one target person in the image. The first voice is the voice of the target person.
The selecting unit 38 includes a touch sensing unit 40 and a people determining unit 42. The touch sensing unit 40 is configured to sense a touch position on the image. When no second touch is sensed within a preset time after a first touch, the people determining unit 42 determines that the part of the image corresponding to the touch position contains the target person. In the embodiment, the preset time is two seconds.
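A minimal sketch of the touch-based selection, assuming a polling design and Python's monotonic clock, is given below; the class and method names are hypothetical and only illustrate the two-second preset time.

    # Hypothetical sketch of the selecting unit 38 (touch sensing unit 40 and
    # people determining unit 42): a touch followed by two seconds without any
    # further touch designates the touched part of the image as the target.
    import time

    PRESET_TIME = 2.0  # seconds, per the embodiment

    class SelectingUnit:
        def __init__(self):
            self.last_touch = None       # (position, timestamp) of the latest touch
            self.target_position = None

        def on_touch(self, position):    # touch sensing unit 40
            self.last_touch = (position, time.monotonic())

        def poll(self):                  # people determining unit 42
            """Call periodically; returns the target position once the preset
            time elapses without another touch."""
            if self.last_touch is not None:
                position, stamp = self.last_touch
                if time.monotonic() - stamp > PRESET_TIME:
                    self.target_position = position
                    self.last_touch = None
            return self.target_position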
The voice filtering system 100 further includes a labeling unit 44. The selecting unit 38 selects one target person. When the second voice is a voice of the target person, the labeling unit 44 labels the target person in the image with a first mark. The first mark is a flashing box.
When the touch sensing unit 40 senses the second touch, the labeling unit 44 further labels the part of the image with a second mark. The second mark can be a circle or a box.
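The labeling unit 44 might be sketched with OpenCV, drawing the first mark only on alternate frames so the box appears to flash; the library choice, colors, and box coordinates are assumptions for the sketch.

    # Hedged sketch of the labeling unit 44 using OpenCV (an assumption).
    import cv2

    def draw_first_mark(frame, box, frame_index):
        """Flashing box around the target person: drawn only on alternate frames."""
        x, y, w, h = box
        if frame_index % 2 == 0:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        return frame

    def draw_second_mark(frame, box):
        """Static box around the part of the image touched a second time."""
        x, y, w, h = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        return frame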
At block 410, the first storage unit 20 prestores at least one first voice.
At block 420, the voice acquiring unit 22 acquires a second voice.
At block 425, the voice analyzing unit 24 analyzes features of each first voice and the second voice.
At block 430, the determining unit 26 determines whether the features of the second voice differ from those of the first voice.
At block 440, the voice filtering unit 28 filters out such differences.
At block 450, the second storage unit 30 stores the filtered second voice.
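A sketch tying blocks 410 through 450 together, reusing the filter_second_voice helper sketched earlier, is shown below; storing the voices as WAV files and the file names themselves are assumptions made only for illustration.

    # Hedged sketch of blocks 410-450 using SciPy's WAV reader/writer.
    import numpy as np
    from scipy.io import wavfile

    RATE, first_voice = wavfile.read("first_voice.wav")    # block 410: prestored first voice
    _, second_voice = wavfile.read("second_voice.wav")     # block 420: acquired second voice

    filtered = filter_second_voice(first_voice.astype(float),
                                   second_voice.astype(float),
                                   RATE)                    # blocks 425-440

    wavfile.write("filtered_second_voice.wav", RATE,
                  filtered.astype(np.int16))                # block 450: second storage unit 30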
At block 402, the switch unit 32 activates a voice filtering function of the voice filtering unit 28. When the voice filtering function is activated, sounds in the second voice whose features differ from those of the first voice can be filtered out.
At block 404, the image capturing unit 34 captures images.
At block 460, the video generating unit 36 generates a video from the captured images and the second voice.
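Block 460 might be sketched by muxing the captured image frames with the filtered second voice using the external ffmpeg tool; the frame-naming pattern, frame rate, codecs, and file names are assumptions.

    # Hedged sketch of the video generating unit 36 (block 460) using ffmpeg.
    import subprocess

    subprocess.run([
        "ffmpeg", "-y",
        "-framerate", "30", "-i", "frames/%04d.png",   # images from image capturing unit 34
        "-i", "filtered_second_voice.wav",             # filtered second voice
        "-c:v", "libx264", "-c:a", "aac", "-shortest",
        "output_video.mp4",
    ], check=True)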
At block 406, the selecting unit 38 selects a target person in the image. The first voice is a voice of the target person.
At block 470, when the second voice is a voice of the target person, the labeling unit 44 labels the target person in the image with a first mark.
At block 405, the touch sensing unit 40 senses a touch position on the image.
At block 407, when there is no second touch within the preset time after the touch sensed at block 405, the people determining unit 42 determines that the part of the image corresponding to the touch position contains the target person.
The exemplary embodiments shown and described above are only examples. Even though numerous characteristics and advantages of the present technology have been set forth in the foregoing description, together with details of the structure and function of the present disclosure, the disclosure is illustrative only, and changes may be made in the details, including in matters of shape, size, and arrangement of the parts within the principles of the present disclosure, up to and including the full extent established by the broad general meaning of the terms used in the claims.