DISPLAY DEVICE

Information

  • Publication Number
    20250103280
  • Date Filed
    January 26, 2022
  • Date Published
    March 27, 2025
Abstract
The present disclosure provides a display device, and a method of operating the same, that can improve voice recognition performance. The display device includes a speaker, a wireless communication interface for communicating with a peripheral device, a microphone for recording surrounding sounds, and a controller that performs voice recognition on data recorded by the microphone, and the controller can supplement user voice recognition using audio data received from the peripheral device.
Description
TECHNICAL FIELD

This disclosure relates to a display device, and more specifically, to voice recognition in a display device.


BACKGROUND ART

Digital TV services using wired or wireless communication networks are becoming common. The digital TV services may provide various services that cannot be provided by existing analog broadcasting services.


For example, in the case of IPTV (Internet Protocol Television) and smart TV services, which are types of digital TV services, interactivity is provided so that users can actively select the types of programs to watch, the viewing time, and the like. IPTV and smart TV services may provide various additional services, such as Internet search, home shopping, online games, etc., based on such interactivity.


Additionally, with the recent development of voice recognition technology, display devices provide a function that allows users to control them by voice. Accordingly, it has become possible for users to more easily control the display device by simply uttering commands.


However, while the display device is receiving a command, surrounding noise, including the sound of the video being played on the display device, may prevent the command from being properly recognized. In addition, the command recognition rate may be lowered by various factors, such as the user uttering the command being located far away from the display device.


DISCLOSURE
Technical Problem

The present disclosure seeks to provide a display device and a method of operating the same that can improve voice recognition performance.


The present disclosure seeks to provide a display device and a method of operating the same that can improve voice preprocessing performance.


Technical Solution

A display device according to an embodiment of the present disclosure seeks to recognize user voice by considering audio recorded or played back from a peripheral device.


A display device according to an embodiment of the present disclosure includes a speaker, a wireless communication interface that communicates with a peripheral device, a microphone that records surrounding sounds, and a controller that performs user voice recognition from data recorded by the microphone, and the controller can compensate user voice recognition using audio data received from the peripheral device.


Audio data may include at least one of recording data recorded by a peripheral device and sound source data being played on the peripheral device.


If the display device is not linked to a peripheral device, the controller recognizes the user voice from the data recorded by the microphone using first preprocessing data. If the display device is linked to the peripheral device, the controller obtains second preprocessing data based on audio data received from the peripheral device and recognizes the user voice from the data recorded by the microphone using the second preprocessing data.
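The selection between first and second preprocessing data described above can be sketched as follows. This is an illustrative sketch, not the patented implementation; the function name and the dictionary-based representation of the preprocessing data are assumptions.

```python
# Hypothetical sketch of the preprocessing-data selection described above.
# The names and data shapes are illustrative, not from the patent.

def choose_preprocessing(linked_to_peripheral, first_data, peripheral_audio=None):
    """Return the preprocessing data to use for user voice recognition.

    If a peripheral device is linked, derive second preprocessing data
    from its audio; otherwise fall back to the first (default) data.
    """
    if linked_to_peripheral and peripheral_audio is not None:
        # Derive second preprocessing data from the peripheral's audio
        # (here, trivially: treat the peripheral recording as a noise profile).
        return {"noise_profile": peripheral_audio}
    return first_data


default = {"noise_profile": None}
assert choose_preprocessing(False, default) is default
assert choose_preprocessing(True, default, [0.1, 0.2])["noise_profile"] == [0.1, 0.2]
```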


When movement of the peripheral device is detected, the controller can receive audio data again from the peripheral device whose movement was detected and modify the second preprocessing data.


The controller can detect the movement of the peripheral device based on changes in a sensing value of an acceleration sensor provided in the peripheral device.


The controller can detect movement of the peripheral device based on detection of a decrease in preprocessing performance.


If the location of the peripheral device cannot be recognized due to the movement of the peripheral device, the controller can recognize the user voice from the data recorded by the microphone using first preprocessing data.


When a decrease in communication speed is detected, the controller can recognize the user voice from the data recorded by the microphone using first preprocessing data.


The controller may synchronize the position and delay time with the peripheral device based on at least one of sound source data being output through a speaker, data recorded by a microphone, and audio data received from the peripheral device.


If there is no sound source output through the speaker or synchronization is not performed, the controller can synchronize the position and delay time by outputting a sample sound corresponding to the inaudible frequency band.
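One way to realize the delay-time synchronization described above is to cross-correlate the display device's own recording with the peripheral's recording and take the best-aligning lag. The following pure-Python sketch illustrates the idea only; a real device would use a known reference signal (e.g. the sample sound) and optimized DSP routines, and the function name is an assumption.

```python
# Illustrative delay estimation between two recordings of the same sound,
# as could be used to synchronize the delay time with a peripheral device.

def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples) that best aligns `delayed` to `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Inner product of the overlapping parts acts as a correlation score.
        score = sum(r * d for r, d in zip(reference, delayed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


signal = [0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0] + signal[:-2]  # same signal arriving 2 samples later
assert estimate_delay(signal, shifted, max_lag=4) == 2
```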


The controller can calculate the location of the peripheral device by adjusting the beamforming angle of the microphone.
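The beamforming-based localization mentioned above ultimately relates an inter-microphone time difference of arrival (TDOA) to an angle. As a minimal sketch under assumed parameter values (microphone spacing, speed of sound), the angle for a two-microphone array can be recovered as:

```python
import math

# Hedged sketch: relating a time difference of arrival (TDOA) between two
# microphones to the source angle, as underlies beamforming localization.
# All numeric values are illustrative assumptions.

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def arrival_angle(tdoa_seconds, mic_spacing_m):
    """Angle of the sound source relative to the array broadside, in degrees."""
    ratio = SPEED_OF_SOUND * tdoa_seconds / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp numerical noise into asin's domain
    return math.degrees(math.asin(ratio))


# No delay difference: the source is directly in front (0 degrees).
assert arrival_angle(0.0, 0.1) == 0.0
# A delay of spacing / c puts the source at 90 degrees (endfire).
assert round(arrival_angle(0.1 / SPEED_OF_SOUND, 0.1)) == 90
```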


The peripheral device may include at least one of a remote control device, a mobile terminal, and a Bluetooth speaker that transmits a control signal to the display device.


The controller can extract user commands from each of the data recorded by the microphone and the data recorded by the peripheral device, and recognize the command with the largest volume among the extracted user commands as the user voice.


The controller can extract user commands from each of the data recorded by the microphone and the data recorded by the peripheral device, and recognize, as the user voice, a user command on which at least two of the extracted user commands agree.
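The two selection strategies above (loudest candidate, and agreement among at least two candidates) can be sketched as follows. The `(command_text, loudness)` tuple representation is an illustrative assumption.

```python
from collections import Counter

# Hedged sketch of the two command-selection strategies described above.
# Each candidate is (command_text, loudness); the data shape is assumed.

def pick_loudest(candidates):
    """First strategy: take the command extracted with the largest volume."""
    return max(candidates, key=lambda c: c[1])[0]

def pick_majority(candidates):
    """Second strategy: take a command that at least two sources agree on."""
    counts = Counter(text for text, _ in candidates)
    text, count = counts.most_common(1)[0]
    return text if count >= 2 else None


candidates = [("volume up", 0.8), ("volume up", 0.5), ("volume app", 0.9)]
assert pick_loudest(candidates) == "volume app"
assert pick_majority(candidates) == "volume up"
```

The majority strategy trades a little recall (it may return nothing) for robustness against a single noisy microphone, while the loudest strategy always returns a result.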


When the wakeup word is recognized, the controller can compensate user voice recognition using audio data received from the peripheral device.


Advantageous Effects

According to an embodiment of the present disclosure, it is possible to more accurately distinguish user voice from surrounding noise through audio recorded or played back in a peripheral device, which has the advantage of improving voice recognition performance.


According to an embodiment of the present disclosure, there is an advantage in that preprocessing performance for voice recognition is improved by synchronizing the location or delay time of peripheral devices.





DESCRIPTION OF DRAWINGS


FIG. 1 is a block diagram illustrating a configuration of a display device according to an embodiment of the present disclosure.



FIG. 2 is a block diagram of a remote control device according to an embodiment of the present disclosure.



FIG. 3 shows an example of an actual configuration of a remote control device according to an embodiment of the present disclosure.



FIG. 4 shows an example of using a remote control device according to an embodiment of the present disclosure.



FIG. 5 is a diagram showing a display device and a peripheral device according to an embodiment of the present disclosure.



FIG. 6 is a flowchart showing an operating method of a display device according to an embodiment of the present disclosure.



FIG. 7 is a flow chart illustrating a method by which a display device performs voice recognition compensation according to the first embodiment of the present disclosure.





BEST MODE

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. The suffixes "module" and "unit or portion" for components used in the following description are provided merely to facilitate preparation of this specification, and thus they are not granted a specific meaning or function.


A display device according to an embodiment of the present disclosure is, for example, an intelligent display device in which a computer support function is added to a broadcast reception function, and may have an easy-to-use interface such as a handwritten input device, a touch screen, a spatial remote control, or the like since an Internet function is added while fulfilling the broadcast receiving function. In addition, it is connected to the Internet and a computer with the support of a wired or wireless Internet function, so that functions such as e-mail, web browsing, banking, or games can also be performed. A standardized general-purpose OS may be used for these various functions.


Accordingly, in the display device described in the present disclosure, various user-friendly functions can be performed because various applications can be freely added or deleted, for example, on a general-purpose OS kernel. More specifically, the display device may be, for example, a network TV, HBBTV, smart TV, LED TV, OLED TV, and the like, and may be applied to a smart phone in some cases.



FIG. 1 is a block diagram showing a configuration of a display device according to an embodiment of the present disclosure.


Referring to FIG. 1, a display device 100 may include a broadcast receiver 130, an external device interface 135, a memory 140, a user input interface 150, a controller 170, a wireless communication interface 173, a microphone 175, a display 180, a speaker 185, and a power supply circuit 190.


The broadcast receiver 130 may include a tuner 131, a demodulator 132, and a network interface 133.


The tuner 131 may select a specific broadcast channel according to a channel selection command. The tuner 131 may receive a broadcast signal for the selected specific broadcast channel.


The demodulator 132 may separate the received broadcast signal into an image signal, an audio signal, and a data signal related to a broadcast program, and restore the separated image signal, audio signal, and data signal to a format capable of being output.


The external device interface 135 may receive an application or a list of applications in an external device adjacent thereto, and transmit the same to the controller 170 or the memory 140.


The external device interface 135 may provide a connection path between the display device 100 and an external device. The external device interface 135 may receive one or more of images and audio output from an external device connected to the display device 100 in a wired or wireless manner, and transmit the same to the controller 170. The external device interface 135 may include a plurality of external input terminals. The plurality of external input terminals may include an RGB terminal, one or more High Definition Multimedia Interface (HDMI) terminals, and a component terminal.


The image signal of the external device input through the external device interface 135 may be output through the display 180. The audio signal of the external device input through the external device interface 135 may be output through the speaker 185.


The external device connectable to the external device interface 135 may be any one of a set-top box, a Blu-ray player, a DVD player, a game machine, a sound bar, a smartphone, a PC, a USB memory, and a home theater, but this is only an example.


The network interface 133 may provide an interface for connecting the display device 100 to a wired/wireless network including an Internet network. The network interface 133 may transmit or receive data to or from other users or other electronic devices through a connected network or another network linked to the connected network.


In addition, a part of the content data stored in the display device 100 may be transmitted to a selected user or a selected electronic device among other users or other electronic devices registered in advance in the display device 100.


The network interface 133 may access a predetermined web page through the connected network or the other network linked to the connected network. That is, it is possible to access a predetermined web page through a network, and transmit or receive data to or from a corresponding server.


In addition, the network interface 133 may receive content or data provided by a content provider or a network operator. That is, the network interface 133 may receive content such as movies, advertisements, games, VOD, and broadcast signals and information related thereto provided from a content provider or a network provider through a network.


In addition, the network interface 133 may receive update information and update files of firmware provided by the network operator, and may transmit data to an Internet or content provider or a network operator.


The network interface 133 may select and receive a desired application from among applications that are open to the public through a network.


The memory 140 may store programs for signal processing and control of the controller 170, and may store images, audio, or data signals, which have been signal-processed.


In addition, the memory 140 may perform a function for temporarily storing images, audio, or data signals input from an external device interface 135 or the network interface 133, and store information on a predetermined image through a channel storage function.


The memory 140 may store an application or a list of applications input from the external device interface 135 or the network interface 133.


The display device 100 may play a content file (a moving image file, a still image file, a music file, a document file, an application file, or the like) stored in the memory 140 and provide the same to the user.


The user input interface 150 may transmit a signal input by the user to the controller 170 or a signal from the controller 170 to the user. For example, the user input interface 150 may receive and process a control signal such as power on/off, channel selection, screen settings, and the like from the remote control device 200 in accordance with various communication methods, such as a Bluetooth communication method, a UWB (Ultra Wideband) communication method, a ZigBee communication method, an RF (Radio Frequency) communication method, or an infrared (IR) communication method, or may perform processing to transmit the control signal from the controller 170 to the remote control device 200.


In addition, the user input interface 150 may transmit a control signal input from a local key (not shown) such as a power key, a channel key, a volume key, and a setting value to the controller 170.


The image signal image-processed by the controller 170 may be input to the display 180 and displayed as an image corresponding to a corresponding image signal. Also, the image signal image-processed by the controller 170 may be input to an external output device through the external device interface 135.


The audio signal processed by the controller 170 may be output to the speaker 185. Also, the audio signal processed by the controller 170 may be input to the external output device through the external device interface 135.


In addition, the controller 170 may control the overall operation of the display device 100.


In addition, the controller 170 may control the display device 100 by a user command input through the user input interface 150 or an internal program, and connect to a network to download an application or a list of applications desired by the user to the display device 100.


The controller 170 may allow the channel information or the like selected by the user to be output through the display 180 or the speaker 185 along with the processed image or audio signal.


In addition, the controller 170 may output an image signal or an audio signal through the display 180 or the speaker 185, according to a command for playing an image of an external device through the user input interface 150, the image signal or the audio signal being input from an external device, for example, a camera or a camcorder, through the external device interface 135.


Meanwhile, the controller 170 may allow the display 180 to display an image, for example, allow a broadcast image which is input through the tuner 131 or an external input image which is input through the external device interface 135, an image which is input through the network interface or an image which is stored in the memory 140 to be displayed on the display 180. In this case, an image being displayed on the display 180 may be a still image or a moving image, and may be a 2D image or a 3D image.


In addition, the controller 170 may allow content stored in the display device 100, received broadcast content, or external input content input from the outside to be played, and the content may have various forms such as a broadcast image, an external input image, an audio file, still images, accessed web screens, and document files.


The wireless communication interface 173 may communicate with an external device through wired or wireless communication. The wireless communication interface 173 may perform short range communication with an external device. To this end, the wireless communication interface 173 may support short range communication using at least one of Bluetooth™, Radio Frequency Identification (RFID), Infrared Data Association (IrDA), Ultra Wideband (UWB), ZigBee, Near Field Communication (NFC), Wi-Fi (Wireless-Fidelity), Wi-Fi Direct, and Wireless USB (Wireless Universal Serial Bus) technologies. The wireless communication interface 173 may support wireless communication between the display device 100 and a wireless communication system, between the display device 100 and another display device 100, or between the display device 100 and a network in which the display device 100 (or an external server) is located through wireless area networks. The wireless area networks may be wireless personal area networks.


Here, another display device 100 may be a wearable device (e.g., a smartwatch, smart glasses, or a head mounted display (HMD)) or a mobile terminal such as a smart phone, which is able to exchange data with (or interwork with) the display device 100 according to the present disclosure. The wireless communication interface 173 may detect (or recognize) a wearable device capable of communication around the display device 100.


Furthermore, when the detected wearable device is an authenticated device to communicate with the display device 100 according to the present disclosure, the controller 170 may transmit at least a portion of data processed by the display device 100 to the wearable device through the wireless communication interface 173. Therefore, a user of the wearable device may use data processed by the display device 100 through the wearable device.


The microphone 175 can acquire audio. The microphone 175 may include at least one microphone (not shown), and may acquire audio around the display device 100 through the microphone (not shown).


The display 180 may convert image signals, data signals, and OSD signals processed by the controller 170, or image signals or data signals received from the external device interface 135 into R, G, and B signals, and generate drive signals.


Meanwhile, since the display device 100 shown in FIG. 1 is only an embodiment of the present disclosure, some of the illustrated components may be integrated, added, or omitted depending on the specification of the display device 100 that is actually implemented.


That is, two or more components may be combined into one component, or one component may be divided into two or more components as necessary. In addition, a function performed in each block is for describing an embodiment of the present disclosure, and its specific operation or device does not limit the scope of the present disclosure.


According to another embodiment of the present disclosure, unlike the display device 100 shown in FIG. 1, the display device 100 may receive an image through the network interface 133 or the external device interface 135 without a tuner 131 and a demodulator 132 and play the same.


For example, the display device 100 may be divided into an image processing device, such as a set-top box, for receiving broadcast signals or content according to various network services, and a content playback device that plays content input from the image processing device.


In this case, an operation method of the display device according to an embodiment of the present disclosure, which will be described below, may be implemented not only by the display device 100 as described with reference to FIG. 1, but also by one of an image processing device such as the separated set-top box and a content playback device including the display 180 and the speaker 185.


Next, a remote control device according to an embodiment of the present disclosure will be described with reference to FIGS. 2 to 3.



FIG. 2 is a block diagram of a remote control device according to an embodiment of the present disclosure, and FIG. 3 shows an actual configuration example of a remote control device 200 according to an embodiment of the present disclosure.


First, referring to FIG. 2, the remote control device 200 may include a fingerprint reader 210, a wireless communication circuit 220, a user input interface 230, a sensor 240, an output interface 250, a power supply circuit 260, a memory 270, a controller 280, and a microphone 290.


Referring to FIG. 2, the wireless communication circuit 220 may transmit and receive signals to and from any one of display devices according to embodiments of the present disclosure described above.


The remote control device 200 may include an RF circuit 221 capable of transmitting and receiving signals to and from the display device 100 according to the RF communication standard, and an IR circuit 223 capable of transmitting and receiving signals to and from the display device 100 according to the IR communication standard. In addition, the remote control device 200 may include a Bluetooth circuit 225 capable of transmitting and receiving signals to and from the display device 100 according to the Bluetooth communication standard. In addition, the remote control device 200 may include an NFC circuit 227 capable of transmitting and receiving signals to and from the display device 100 according to the NFC (near field communication) communication standard, and a WLAN circuit 229 capable of transmitting and receiving signals to and from the display device 100 according to the wireless LAN (WLAN) communication standard.


In addition, the remote control device 200 may transmit a signal containing information on the movement of the remote control device 200 to the display device 100 through the wireless communication circuit 220.


In addition, the remote control device 200 may receive a signal transmitted by the display device 100 through the RF circuit 221, and transmit a command regarding power on/off, channel change, volume adjustment, or the like to the display device 100 through the IR circuit 223 as necessary.


The user input interface 230 may include a keypad, a button, a touch pad, a touch screen, or the like. The user may input a command related to the display device 100 to the remote control device 200 by operating the user input interface 230. When the user input interface 230 includes a hard key button, the user may input a command related to the display device 100 to the remote control device 200 through a push operation of the hard key button. Details will be described with reference to FIG. 3.


Referring to FIG. 3, the remote control device 200 may include a plurality of buttons. The plurality of buttons may include a fingerprint recognition button 212, a power button 231, a home button 232, a live button 233, an external input button 234, a volume control button 235, a voice recognition button 236, a channel change button 237, an OK button 238, and a back-play button 239.


The fingerprint recognition button 212 may be a button for recognizing a user's fingerprint. In one embodiment, the fingerprint recognition button 212 may enable a push operation, and thus may receive a push operation and a fingerprint recognition operation.


The power button 231 may be a button for turning on/off the power of the display device 100.


The home button 232 may be a button for moving to the home screen of the display device 100.


The live button 233 may be a button for displaying a real-time broadcast program.


The external input button 234 may be a button for receiving an external input connected to the display device 100.


The volume control button 235 may be a button for adjusting the level of the volume output by the display device 100.


The voice recognition button 236 may be a button for receiving a user's voice and recognizing the received voice.


The channel change button 237 may be a button for receiving a broadcast signal of a specific broadcast channel.


The OK button 238 may be a button for selecting a specific function, and the back-play button 239 may be a button for returning to a previous screen.


A description will be given referring again to FIG. 2.


When the user input interface 230 includes a touch screen, the user may input a command related to the display device 100 to the remote control device 200 by touching a soft key of the touch screen. In addition, the user input interface 230 may include various types of input means that may be operated by a user, such as a scroll key or a jog key, and the present embodiment does not limit the scope of the present disclosure.


The sensor 240 may include a gyro sensor 241 or an acceleration sensor 243, and the gyro sensor 241 may sense information regarding the movement of the remote control device 200.


For example, the gyro sensor 241 may sense information about the operation of the remote control device 200 based on the x, y, and z axes, and the acceleration sensor 243 may sense information about the moving speed of the remote control device 200. Meanwhile, the remote control device 200 may further include a distance measuring sensor to sense the distance between the remote control device 200 and the display 180.


The output interface 250 may output an image or audio signal corresponding to the operation of the user input interface 230 or a signal transmitted from the display device 100.


The user may recognize whether the user input interface 230 is operated or whether the display device 100 is controlled through the output interface 250.


For example, the output interface 250 may include an LED 251 that emits light, a vibrator 253 that generates vibration, a speaker 255 that outputs sound, or a display 257 that outputs an image when the user input interface 230 is operated or a signal is transmitted and received to and from the display device 100 through the wireless communication circuit 220.


In addition, the power supply circuit 260 may supply power to the remote control device 200, and stop power supply when the remote control device 200 has not moved for a predetermined time to reduce power consumption.
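The inactivity power-down policy described above can be sketched as a simple timeout check. The timeout value is an assumption for illustration; the actual threshold is not specified in the text.

```python
# Illustrative sketch of the inactivity power-down policy described above.
# The timeout value is an assumed example, not from the patent.

IDLE_TIMEOUT_S = 30.0

def should_power_down(now_s, last_movement_s):
    """Stop supplying power when the remote has not moved for the timeout."""
    return (now_s - last_movement_s) >= IDLE_TIMEOUT_S


assert should_power_down(100.0, 60.0) is True   # idle for 40 s: cut power
assert should_power_down(100.0, 80.0) is False  # idle for only 20 s: keep power
```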


The power supply circuit 260 may restart power supply when a predetermined key provided in the remote control device 200 is operated.


The memory 270 may store various types of programs and application data required for control or operation of the remote control device 200.


When the remote control device 200 transmits and receives signals wirelessly to and from the display device 100 through the RF circuit 221, the remote control device 200 and the display device 100 transmit and receive signals through a predetermined frequency band.


The controller 280 of the remote control device 200 may store and refer to information on a frequency band capable of wirelessly transmitting and receiving signals to and from the display device 100 paired with the remote control device 200 in the memory 270.


The controller 280 may control all matters related to the control of the remote control device 200. The controller 280 may transmit a signal corresponding to a predetermined key operation of the user input interface 230 or a signal corresponding to the movement of the remote control device 200 sensed by the sensor 240 to the display device 100 through the wireless communication circuit 220.


Also, the microphone 290 of the remote control device 200 may obtain a speech.


A plurality of microphones 290 may be provided.


Next, a description will be given referring to FIG. 4.



FIG. 4 shows an example of using a remote control device according to an embodiment of the present disclosure.


In FIG. 4, (a) illustrates that a pointer 205 corresponding to the remote control device 200 is displayed on the display 180.


The user may move or rotate the remote control device 200 up, down, left and right. The pointer 205 displayed on the display 180 of the display device 100 may correspond to the movement of the remote control device 200. As shown in the drawings, the pointer 205 is moved and displayed according to movement of the remote control device 200 in a 3D space, so the remote control device 200 may be called a space remote control device.


In (b) of FIG. 4, it is illustrated that when the user moves the remote control device 200 to the left, the pointer 205 displayed on the display 180 of the display device 100 moves to the left correspondingly.


Information on the movement of the remote control device 200 detected through a sensor of the remote control device 200 is transmitted to the display device 100. The display device 100 may calculate the coordinates of the pointer 205 based on information on the movement of the remote control device 200. The display device 100 may display the pointer 205 to correspond to the calculated coordinates.
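The coordinate calculation described above can be sketched as scaling the sensed movement deltas and clamping to the screen. The screen size and sensitivity factor are illustrative assumptions.

```python
# Minimal sketch of mapping remote-control movement to pointer coordinates,
# as in FIG. 4. Screen size and sensitivity are assumed example values.

SCREEN_W, SCREEN_H = 1920, 1080
SENSITIVITY = 10.0  # pixels per sensed movement unit (illustrative)

def update_pointer(x, y, dx, dy):
    """Move the pointer by the sensed deltas, clamped to the display area."""
    x = min(max(x + dx * SENSITIVITY, 0), SCREEN_W - 1)
    y = min(max(y + dy * SENSITIVITY, 0), SCREEN_H - 1)
    return x, y


# Moving the remote left moves the pointer left proportionally.
assert update_pointer(960, 540, -5, 0) == (910, 540)
# The pointer never leaves the display area.
assert update_pointer(0, 0, -100, -100) == (0, 0)
```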


In (c) of FIG. 4, it is illustrated that a user moves the remote control device 200 away from the display 180 while pressing a specific button in the remote control device 200. Accordingly, a selected area in the display 180 corresponding to the pointer 205 may be zoomed in and displayed enlarged.


Conversely, when the user moves the remote control device 200 to be close to the display 180, the selected area in the display 180 corresponding to the pointer 205 may be zoomed out and displayed reduced.


On the other hand, when the remote control device 200 moves away from the display 180, the selected area may be zoomed out, and when the remote control device 200 moves to be close to the display 180, the selected area may be zoomed in.


Also, in a state in which a specific button in the remote control device 200 is being pressed, recognition of up, down, left, or right movements may be excluded. That is, when the remote control device 200 moves away from or close to the display 180, the up, down, left, or right movements are not recognized, and only the forward and backward movements may be recognized. In a state in which a specific button in the remote control device 200 is not being pressed, only the pointer 205 moves according to the up, down, left, or right movements of the remote control device 200.


Meanwhile, the movement speed or the movement direction of the pointer 205 may correspond to the movement speed or the movement direction of the remote control device 200.


Meanwhile, in the present specification, a pointer refers to an object displayed on the display 180 in response to an operation of the remote control device 200. Accordingly, objects of various shapes other than the arrow shape shown in the drawings are possible as the pointer 205. For example, the object may be a concept including a dot, a cursor, a prompt, a thick outline, and the like. In addition, the pointer 205 may be displayed corresponding to any one point among points on a horizontal axis and a vertical axis on the display 180, and may also be displayed corresponding to a plurality of points such as a line and a surface.



FIG. 5 is a diagram showing a display device and a peripheral device according to an embodiment of the present disclosure.


At least one peripheral device 301, 302, 303 may be located around the display device 100. Each of the peripheral devices 301, 302, and 303 may include at least one microphone and at least one wireless communication module. The peripheral devices 301, 302, and 303 may include a Bluetooth speaker 301, a mobile terminal 302 such as a smartphone, and a remote control device 303 that transmits a control signal to the display device 100. In addition, the peripheral devices may include various types of electronic devices that are equipped with a microphone and capable of transmitting and receiving audio data with the display device 100.


The microphone 175 may be provided in a part of the display device 100. In particular, the microphone 175 may be formed on the exterior of the display device 100. For example, the microphone 175 may be provided at the bottom of the case surrounding the display 180; however, since this is only an example, the location of the microphone 175 is not limited thereto.


The microphone 175 records surrounding sounds, and the controller 170 can acquire recording data recorded by the microphone 175 and perform user voice recognition from the recording data. That is, the controller 170 can recognize the user voice from the recording data of the microphone 175.


According to an embodiment of the present disclosure, the controller 170 can compensate user voice recognition by using audio data received from the peripheral devices 301, 302, 303 in order to recognize the user voice more accurately.


The controller 170 may perform user voice recognition to obtain a wakeup word and user voice recognition to obtain a command. The wakeup word may be a command that starts voice recognition.


According to one embodiment, the controller 170 can always perform compensation of the user voice recognition when recognizing a user voice.


According to another embodiment, the controller 170 may perform compensation of the user voice recognition when recognizing a user voice after the wakeup word is recognized. Specifically, the controller 170 does not perform compensation of the user voice recognition in the operating state for recognizing the wakeup word, but may perform compensation of the user voice recognition when recognizing the user voice to obtain a user command after the wakeup word is recognized. That is, when the wakeup word is recognized, the controller 170 can compensate user voice recognition using audio data received from the peripheral device.


For example, when a wakeup word such as “Hi, LG” is recognized, the controller 170 performs verification of the wakeup word, and when verification of the wakeup word is completed, performs compensation of the user voice recognition to improve recognition of the command.



FIG. 6 is a flowchart showing an operating method of a display device according to an embodiment of the present disclosure.


The controller 170 can search for peripheral devices (S10).


The controller 170 can search for peripheral devices such as Bluetooth-enabled devices, Wi-Fi connected devices, or devices registered to the display device 100.


The controller 170 may transmit, to the searched peripheral device, a signal requesting that the microphone be turned on. The peripheral device can turn on its microphone according to the signal received from the display device 100.


The controller 170 can share audio data with the searched peripheral device (S20).


Audio data may include at least one of recording data recorded through a microphone and sound source data being played through a speaker in each of the display device 100 and the peripheral device.


The controller 170 may receive at least one of recording data recorded by the peripheral device and sound source data being played by the peripheral device through the wireless communication interface 173. Additionally, the controller 170 may transmit recording data recorded by the microphone 175 and sound source data being output through the speaker 185 to at least one peripheral device through the wireless communication interface 173.


In this way, the controller 170 can share audio data by transmitting and receiving audio data with a peripheral device.


The controller 170 may perform compensation for voice recognition using audio data (S30).


Next, a method in which a display device according to various embodiments of the present disclosure performs compensation for voice recognition will be described.



FIG. 7 is a flowchart showing a method by which a display device performs compensation for voice recognition according to the first embodiment of the present disclosure.


According to the first embodiment, the display device 100 can optimize voice preprocessing by linking with peripheral devices equipped with microphones through a network and sharing currently recorded data and played data with each other.


In FIG. 7, it is assumed that the display device 100 is the master and the peripheral device is the slave. However, this is only for convenience of explanation, so it is reasonable that it is not limited thereto.


The controller 170 can share audio data with a peripheral device (S110).


In other words, the display device 100 and the peripheral device can share data of the sound source currently being played and recording data recorded from the microphone.


The controller 170 can synchronize the location and delay time with the peripheral device (S120).


The controller 170 can synchronize the location and delay time with the peripheral device based on at least one of sound source data being output by the speaker 185, recording data recorded by the microphone 175, and audio data received from the peripheral device.


For example, the controller 170 can calculate the location of the peripheral device by adjusting the beamforming angle of the microphone 175. More specifically, the controller 170 can recognize the location of the peripheral device by performing recordings while periodically adjusting the beamforming angle of the microphone 175 and analyzing the magnitude of the sound output by the peripheral device at each angle.
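The angle sweep described above can be sketched as follows. This is an illustrative assumption, not the disclosure's implementation: `record_at_angle` is a hypothetical stand-in for the microphone driver, and the direction estimate simply picks the beamforming angle at which the peripheral device's output is recorded loudest.

```python
# Hypothetical sketch: sweep the beamforming angle and pick the angle at
# which the peripheral device's sound is recorded with the highest RMS level.

def estimate_direction(angles, record_at_angle):
    """Return the beamforming angle with the highest recorded RMS level."""
    def rms(samples):
        return (sum(s * s for s in samples) / len(samples)) ** 0.5
    return max(angles, key=lambda a: rms(record_at_angle(a)))

# Toy example: the peripheral device's sound is strongest near 60 degrees.
levels = {0: [0.1, -0.1], 30: [0.3, -0.3], 60: [0.9, -0.9], 90: [0.4, -0.4]}
best = estimate_direction(levels.keys(), lambda a: levels[a])
```

In practice the recording at each angle would come from the beamformed microphone array rather than a lookup table; the sketch only shows the selection logic.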


In addition, the controller 170 can calculate the delay time with the peripheral device by analyzing the time when the sound source is output through the speaker 185 and the time when the sound output from the speaker 185 is recorded by the peripheral device.
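A minimal sketch of the delay-time estimation described above, under the assumption that the delay is found by aligning the played sound source against the peripheral device's recording. The lag that maximizes the cross-correlation between the two signals is taken as the delay; function names and sample data are illustrative.

```python
# Illustrative sketch: estimate the delay (in samples) between the sound
# source output by the speaker and the same sound as recorded by the
# peripheral device, via a brute-force cross-correlation search.

def estimate_delay(reference, recording, max_lag):
    """Return the lag that best aligns the recording to the reference."""
    def corr(lag):
        n = min(len(reference), len(recording) - lag)
        return sum(reference[i] * recording[i + lag] for i in range(n))
    return max(range(max_lag + 1), key=corr)

# Toy example: the recording is the reference delayed by 3 samples.
ref = [0.0, 1.0, 0.0, -1.0, 0.5, 0.0]
rec = [0.0] * 3 + ref
delay = estimate_delay(ref, rec, max_lag=5)
```

A real implementation would use an FFT-based correlation over audio frames, but the selection criterion is the same.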


In other words, the display device 100 and the peripheral device can synchronize the location and delay time by mutually recording and sharing the currently played sound source.


Meanwhile, when there is no sound source being output through the speaker 185 or synchronization is not performed, the controller 170 can perform synchronization of location and delay time by outputting a sample sound corresponding to an inaudible frequency band.


According to one embodiment, the controller 170 can play the sound source while periodically adjusting its left and right volumes during synchronization. For example, the controller 170 plays the sound source while adjusting the left and right volume levels to 100 left/0 right, 75 left/25 right, 50 left/50 right, 25 left/75 right, and 0 left/100 right at predetermined cycles. The peripheral device can record the volume-adjusted sound source and transmit the recording to the display device 100. The controller 170 can obtain the location and delay time of each peripheral device by analyzing the data recorded by the peripheral device. If the controller 170 outputs a sample sound corresponding to an inaudible frequency band, the controller 170 may analyze the sound source by extracting only that frequency band from the data recorded by the peripheral device.
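The left/right volume sweep can be sketched as below. This is a hypothetical illustration: the display device plays the sound source at each left/right ratio, the peripheral device reports the level it recorded at each step, and the ratio that produced the loudest recording indicates which side of the display device the peripheral sits on.

```python
# Illustrative sketch of the left/right volume sweep used for localization.
# The sweep ratios follow the example in the text; the level analysis is a
# simplifying assumption.

SWEEP = [(100, 0), (75, 25), (50, 50), (25, 75), (0, 100)]  # (left, right) %

def estimate_side(recorded_levels):
    """Map one recorded level per sweep step to 'left', 'center', or 'right'."""
    best = max(range(len(SWEEP)), key=lambda i: recorded_levels[i])
    left, right = SWEEP[best]
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "center"

# Toy example: the peripheral records loudest when most volume is on the right.
side = estimate_side([0.1, 0.2, 0.4, 0.7, 0.9])
```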


Meanwhile, the delay time here refers to the delay between the actual sound and the data transmitted over the network, and synchronization of this delay may be required. For example, the controller 170 transmits sound source data so that the peripheral device plays the same sound source as the display device 100, and performs synchronization of the delay time by analyzing the data recorded by the display device 100.


In this way, when the location and delay time are synchronized between the display device 100 and the peripheral device, preprocessing optimization considering the peripheral device can be performed. That is, the controller 170 can obtain preprocessing data considering the sound source data played on the peripheral device. The controller 170 can recognize the user voice using the obtained preprocessing data.


The display device 100 and the peripheral device can each utilize the currently playing sound source data for preprocessing by transmitting it to each other and interoperating.


Meanwhile, the preprocessed data may change to become unsuitable for user voice recognition due to various factors such as movement of peripheral devices or slow communication speed. In this case, the controller 170 may perform an operation to modify or re-acquire the preprocessing data, which will be described below.


The controller 170 can detect whether the peripheral device is moving (S130).


This is because modification of preprocessing data is required if the location of the peripheral device moves.


For example, the peripheral device may be equipped with an acceleration sensor, and the sensing value of the acceleration sensor may be transmitted to the display device 100. In this case, the controller 170 may detect movement of the peripheral device based on a change in the sensing value of an acceleration sensor provided in the peripheral device.


As another example, the controller 170 may detect movement of a peripheral device based on detection of a decrease in preprocessing performance. Specifically, the controller 170 may calculate preprocessing performance at every predetermined cycle and determine whether the calculated preprocessing performance is less than a preset reference value. If the calculated preprocessing performance is less than the preset reference value, the controller 170 may recognize the preprocessing performance as being deteriorated.
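The periodic performance check described above can be sketched as follows. The metric and the reference value are illustrative assumptions; the point is only that a drop below a preset reference is treated as a sign that the peripheral device may have moved.

```python
# Minimal sketch: calculate preprocessing performance at every cycle and
# flag possible movement of the peripheral device when the latest value
# falls below a preset reference value (value here is illustrative).

REFERENCE = 0.6  # preset reference value (assumption for illustration)

def movement_suspected(performance_history, reference=REFERENCE):
    """Return True once the latest measured performance drops below reference."""
    return bool(performance_history) and performance_history[-1] < reference

moved = movement_suspected([0.9, 0.85, 0.4])    # performance degraded
steady = movement_suspected([0.9, 0.88, 0.87])  # performance stable
```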


If the movement of the peripheral device is not detected, the controller 170 can detect whether the communication speed has decreased (S140).


This is because, if a network speed delay occurs, data may not be exchanged normally between the devices; deterioration of preprocessing performance is therefore prevented by immediately switching to the existing preprocessing mode.


Meanwhile, the order of the step of detecting whether the peripheral device is moving and the step of detecting a decrease in communication speed may be changed or performed simultaneously.


If no movement of the peripheral device is detected and no decrease in communication speed is detected, the controller 170 may perform preprocessing using audio data from the peripheral device (S150).


In other words, if no movement of the peripheral device is detected and no decrease in communication speed is detected, the controller 170 may perform user voice recognition using preprocessing data obtained through synchronization with the peripheral device.


Meanwhile, when movement of the peripheral device is detected, the controller 170 can recalculate the location of the moved peripheral device, and at this time, the controller 170 can determine whether recognition of the location of the peripheral device is impossible (S160).


If the controller 170 is capable of recognizing the location of the peripheral device, it can re-share audio data with the peripheral device to synchronize the location and delay time with the peripheral device whose location has changed.


However, if the controller 170 cannot recognize the location of the peripheral device, it can perform preprocessing using its own audio data (S170). Additionally, even when a decrease in communication speed is detected, the controller 170 can perform preprocessing using only its own audio data (S170).


In other words, if the location of the peripheral device cannot be recognized or the communication speed is slow, the controller 170 may perform preprocessing using only the audio data of the display device 100 without using the audio data of the peripheral device.


This is because a peripheral device whose location the controller 170 cannot recognize is considered to be located in another space or the like, where there is no interference between microphones. In addition, in the case of a decrease in communication speed, this prevents a decrease in preprocessing performance due to audio data not being transferred properly between the devices.


In summary, when the controller 170 is not linked with a peripheral device because its location cannot be recognized or normal communication is not possible, the controller 170 recognizes the user voice from the data recorded by the microphone 175 using first preprocessing data. When linked with a peripheral device through location recognition and normal communication, the controller 170 acquires second preprocessing data based on audio data received from the peripheral device, and recognizes the user voice from the data recorded by the microphone 175 using the second preprocessing data.


Here, first preprocessing data may mean data for recognizing a user voice in the recording data of the microphone 175, considering only the audio data of the display device 100. The second preprocessing data may refer to data for recognizing a user voice from the recording data of the microphone 175 when considering the audio data of the display device 100 and the audio data of at least one peripheral device.


Therefore, when the location of the peripheral device cannot be recognized due to movement of the peripheral device, or when a decrease in communication speed is detected, the controller 170 recognizes the user voice from the data recorded by the microphone 175 using the first preprocessing data. In addition, when movement of the peripheral device is detected, the controller 170 may receive audio data again from the peripheral device whose movement was detected and modify the second preprocessing data.


As such, according to the first embodiment of the present disclosure, the display device 100 and the peripheral device synchronize their locations and delay times, so that each device can filter the sound output from the other device. Accordingly, the user voice can be more accurately recognized from the data recorded from each microphone. For example, the controller 170 can more accurately recognize the sound output from the peripheral device in the recording data, and thus can accurately recognize the user voice by filtering the sound output from the peripheral device in the recording data.
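The filtering step above can be illustrated with an adaptive echo canceller. This is an assumption for illustration, not the patent's stated algorithm: a normalized LMS filter uses the shared sound-source data as a reference and subtracts its estimated contribution from the microphone recording, which is one standard way a device could filter out the sound output by the other device once the signals are synchronized.

```python
# Illustrative sketch (assumed technique): normalized LMS adaptive filtering
# that suppresses the other device's known playback from the mic recording.

def nlms_filter(mic, reference, taps=4, mu=0.5, eps=1e-8):
    """Return the mic signal with the reference's contribution suppressed."""
    w = [0.0] * taps
    out = []
    for n in range(len(mic)):
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wi * xi for wi, xi in zip(w, x))   # estimated echo
        e = mic[n] - y                             # residual = cleaned sample
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out

# Toy example: the mic hears only an attenuated copy of the shared playback,
# so after the filter adapts, the residual should shrink toward zero.
ref = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
mic = [0.8 * r for r in ref]
cleaned = nlms_filter(mic, ref)
```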


Additionally, when multiple devices with voice recognition functions are in the same space, the problem of voice misrecognition caused by the audio each device plays can be minimized. In addition, there is an advantage in that preprocessing performance is improved because voices acquired from multiple devices are preprocessed, compared to the existing method of preprocessing a voice acquired from a single device. In addition, the probability of misrecognition and malfunction can be minimized even when there is a lot of noise, such as when other audio is played near a device with a voice recognition function.


Next, a method by which the display device 100 performs compensation of voice recognition according to the second embodiment of the present disclosure will be described.


According to the second embodiment of the present disclosure, the controller 170 may recognize the user voice based on the size of the command recognized in the data recorded from each of the display device 100 and at least one peripheral device.


Specifically, the controller 170 extracts user commands from each of the data recorded by the microphone 175 and the data recorded by the peripheral device, and recognizes the command with the largest magnitude among the extracted user commands as the user voice.


This reflects the fact that the closer the device is to the user, the more likely it is that the user's speech will be recorded most clearly.
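The selection rule of the second embodiment can be sketched as below. The device labels and level values are illustrative; the logic simply keeps the command captured with the largest magnitude, on the assumption that the device closest to the user records the utterance most clearly.

```python
# Minimal sketch of the second embodiment: among the commands extracted from
# each device's recording, recognize the loudest one as the user voice.

def pick_loudest(candidates):
    """candidates: list of (command_text, level); return the loudest text."""
    command, _ = max(candidates, key=lambda c: c[1])
    return command

# Toy example: the mobile terminal near the user records the command loudest.
chosen = pick_loudest([("volume up", 0.12),    # display device microphone
                       ("volume up", 0.55),    # mobile terminal near the user
                       ("volume cup", 0.07)])  # distant Bluetooth speaker
```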


Next, a method by which the display device 100 performs compensation of voice recognition according to the third embodiment of the present disclosure will be described.


According to the third embodiment of the present disclosure, the controller 170 extracts user commands from each of the data recorded by the microphone 175 and the data recorded by the peripheral device, and recognizes user commands that match at least two of the extracted user commands as the user's voice.


Alternatively, the controller 170 may extract user commands from each of the data recorded by the microphone 175 and the data recorded by the peripheral device, and recognize the most frequently extracted user command as the user voice.
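The alternative rule of the third embodiment, recognizing the most frequently extracted command, can be sketched as follows; the command strings are illustrative.

```python
# Minimal sketch of the third embodiment: the command extracted most often
# across the devices' recordings is recognized as the user voice.

from collections import Counter

def pick_majority(commands):
    """Return the most frequently extracted command."""
    return Counter(commands).most_common(1)[0][0]

# Toy example: two devices agree, one device misrecognizes.
chosen = pick_majority(["channel up", "channel up", "channel cup"])
```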


According to an embodiment of the present disclosure, even if the user is far away from the display device 100 and the microphone 175 records ambient noise louder than the user voice, the recognition result is supplemented by a device located near the user, and therefore the user voice can be recognized more accurately.


Additionally, as in the embodiment of the present disclosure, beamforming performance can be improved by considering audio data from a peripheral device rather than from the display device 100 alone. For example, assuming that each device is equipped with two microphones, if two devices operate simultaneously, sound sources from various angles can be recorded through four microphones, thereby improving beamforming performance. In other words, since the master device can use not only its own microphone but also the microphone of at least one slave device, an effect of increasing the number of microphones for long-distance voice recognition can be expected. In addition, there is an advantage in that preprocessing performance is improved by incorporating external sounds that affect voice preprocessing.


Meanwhile, in this specification, it is explained that the display device 100 is the main device that recognizes user voice, and the peripheral device operates to complement user voice recognition in the display device 100, but the opposite may also be true. That is, one peripheral device is the main device that recognizes the user voice, and the remaining peripheral devices, including the display device 100, may operate to supplement user voice recognition.


According to an embodiment of the present disclosure, the above-described method may be implemented with codes readable by a processor on a medium in which a program is recorded. Examples of the medium readable by the processor include a ROM (Read Only Memory), a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.


The display device as described above is not limited to the configuration and method of the above-described embodiments, but the embodiments may be configured by selectively combining all or part of each embodiment such that various modifications can be made.

Claims
  • 1. A display device, comprising: a speaker;a wireless communication interface configured to communicate with a peripheral device;a microphone configured to record surrounding sounds; anda controller configured to perform user voice recognition on data recorded by the microphone,wherein the controller is configured to compensate the user voice recognition using audio data received from the peripheral device.
  • 2. The display device according to claim 1, wherein the audio data includes at least one of recording data recorded by the peripheral device and sound source data being played by the peripheral device.
  • 3. The display device according to claim 1, wherein the controller is configured to: if the display device is not linked with the peripheral device, recognize a user voice from the data recorded by the microphone using first preprocessing data,if the display device is linked with the peripheral device, obtain second preprocessing data based on audio data received from the peripheral device, and recognize the user voice from the data recorded by the microphone using the second preprocessing data.
  • 4. The display device according to claim 3, wherein the controller is configured to modify the second preprocessing data by receiving the audio data again from the peripheral device whose movement was detected, if movement of the peripheral device is detected.
  • 5. The display device according to claim 4, wherein the controller is configured to detect the movement of the peripheral device based on change in a sensing value of an acceleration sensor provided in the peripheral device.
  • 6. The display device according to claim 4, wherein the controller is configured to detect the movement of the peripheral device based on detection of a decrease in preprocessing performance.
  • 7. The display device according to claim 4, wherein the controller is configured to recognize the user voice from the data recorded by the microphone using the first preprocessing data, if location of the peripheral device cannot be recognized due to the movement of the peripheral device.
  • 8. The display device according to claim 4, wherein the controller is configured to recognize the user voice from the data recorded by the microphone using the first preprocessing data, if decrease in communication speed is detected.
  • 9. The display device according to claim 1, wherein the controller is configured to synchronize location and delay time with the peripheral device based on at least one of sound source data being output by the speaker, data recorded by the microphone, and audio data received from the peripheral device.
  • 10. The display device according to claim 9, wherein the controller is configured to synchronize location and delay time by outputting a sample sound corresponding to an inaudible frequency band if there is no sound source being output by the speaker or synchronization is not performed.
  • 11. The display device according to claim 9, wherein the controller is configured to calculate location of the peripheral device by adjusting a beamforming angle of the microphone.
  • 12. The display device according to claim 1, wherein the peripheral device comprises at least one of a remote control device that transmits a control signal to the display device, a mobile terminal, and a Bluetooth speaker.
  • 13. The display device according to claim 1, wherein the controller is configured to extract user commands from each of the data recorded by the microphone and the data recorded by the peripheral device, and recognize a command with the largest size among the extracted user commands as the user voice.
  • 14. The display device according to claim 1, wherein the controller is configured to extract user commands from each of the data recorded by the microphone and the data recorded by the peripheral device, and recognize user commands that match at least two of the extracted user commands as the user's voice.
  • 15. The display device according to claim 1, wherein the controller is configured to compensate the user voice recognition using audio data received from the peripheral device if a wakeup word is recognized.
PCT Information
Filing Document Filing Date Country Kind
PCT/KR2022/001347 1/26/2022 WO