The present disclosure is directed to factory systems, and more specifically, to voice recognition systems for factory floors.
Voice input has become popular due to the development of machine learning technologies. Voice input is now widely used in consumer products such as smartphones.
Voice input provides several benefits, such as ease of input and flexibility. In related art implementations, factory operators have attempted to utilize voice input methods for machine operation. If such implementations can be realized, workers on the factory shop floor can easily collaborate with industrial machines and improve productivity.
However, one of the problems that occur on the factory floor is noise. A factory tends to have many machines and these machines cause different types of noise. Such noise degrades the accuracy of the command recognition. Further, machine operation requires high accuracy to prevent unintended operations, which might cause accidents.
There have been approaches to develop a voice recognition program that can be applied to various environments and different people. Typical machine-learning-based voice recognition programs involve a voice recognition algorithm and a parameter set calculated from a very large data set.
Such related art approaches either enhance the voice recognition algorithm, or prepare and use a large data set including various types of noise and human voices. The first approach requires a lot of time to implement. The second approach requires a large data set. Furthermore, the data set should cover all kinds of factory environments.
Example implementations herein are directed to maintaining high accuracy for voice recognition even in a noisy environment surrounded by manufacturing machines. In example implementations described herein, the accuracy of command recognition via human voice is improved from a system deployment viewpoint. In example implementations described herein, methods and systems are directed to maximizing the accuracy of voice recognition on a noisy factory shop floor by using an appropriate parameter set based on the operator condition.
Aspects of the present disclosure can involve a method, involving executing a user check-in process to determine user identification and location information; applying parameters to a speech recognition algorithm and a denoising algorithm based on the user information and location information; and configuring a process to be controlled through the speech recognition algorithm and the denoising algorithm.
Aspects of the present disclosure can involve a computer program storing instructions for executing a process, the instructions involving executing a user check-in process to determine user identification and location information; applying parameters to a speech recognition algorithm and a denoising algorithm based on the user information and location information; and configuring a process to be controlled through the speech recognition algorithm and the denoising algorithm. The computer program can be stored in a non-transitory computer readable medium and configured to be executed by one or more processors.
Aspects of the present disclosure can involve a system, involving means for executing a user check-in process to determine user identification and location information; means for applying parameters to a speech recognition algorithm and a denoising algorithm based on the user information and location information; and means for configuring a process to be controlled through the speech recognition algorithm and the denoising algorithm.
Aspects of the present disclosure can involve an apparatus configured to control a machine, the apparatus involving a processor, configured to execute a user check-in process to determine user identification and location information; apply parameters to a speech recognition algorithm and a denoising algorithm based on the user information and location information; and configure the machine to be controlled through the speech recognition algorithm and the denoising algorithm.
The following detailed description provides further details of the figures and example implementations of the present application. Reference numerals and descriptions of redundant elements between figures are omitted for clarity. Terms used throughout the description are provided as examples and are not intended to be limiting. For example, the use of the term “automatic” may involve fully automatic or semi-automatic implementations involving user or administrator control over certain aspects of the implementation, depending on the desired implementation of one of ordinary skill in the art practicing implementations of the present application. Selection can be conducted by a user through a user interface or other input means, or can be implemented through a desired algorithm. Example implementations as described herein can be utilized either singularly or in combination and the functionality of the example implementations can be implemented through any means according to the desired implementations.
In an example implementation described below, the system selects an optimal parameter set based on the operator and operator location to keep the accuracy high and to reduce false positives.
Memory 210 may store user management program 211, deploy management program 212, user management table 213, environmental value management table 214 and running process management table 215. The user management program 211 is executed by CPU 201 when a User Identification Request is received. The deploy management program 212 is executed by CPU 201 when a Check-in Request is received. The user management table 213 stores information regarding users who operate the manipulator. The environmental value management table 214 stores configuration data including the parameter set for each user and location. The running process management table 215 stores the status of the processes running on the signal processing server.
Further details of the elements of the management server are described as follows.
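The three tables held in memory 210 can be pictured as simple keyed mappings. The following is a minimal sketch only: the field names in `ParameterSet`, the column choices, and the sample identifiers such as `u001` and `cell-3` are assumptions made for illustration and are not specified in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ParameterSet:
    # Hypothetical fields; the description only states that a parameter
    # set is stored for each user and location.
    speech_model: str
    denoise_profile: str


@dataclass
class ManagementTables:
    # User management table 213: user ID -> user name (illustrative column).
    users: Dict[str, str] = field(default_factory=dict)
    # Environmental value management table 214: (user ID, location) -> parameters.
    environments: Dict[Tuple[str, str], ParameterSet] = field(default_factory=dict)
    # Running process management table 215: user ID -> process ID.
    running: Dict[str, int] = field(default_factory=dict)

    def select_parameters(self, user_id: str, location: str) -> Optional[ParameterSet]:
        """Return the parameter set stored for this user and location, if any."""
        return self.environments.get((user_id, location))


tables = ManagementTables()
tables.users["u001"] = "operator A"
tables.environments[("u001", "cell-3")] = ParameterSet("model_u001_cell3", "noise_cell3")
```

Keying table 214 on the pair of user and location mirrors the statement that the parameter set is chosen per user and per location; a lookup that finds no entry returns `None` so the caller can decide how to handle an unconfigured combination.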
If the message type is not a check-in (No), then the request is determined to be associated with a check-out procedure, in which case the program retrieves the user ID from the request message at 711. At 712, the program searches the running process management table 215 and identifies the process ID with the user ID as a search key. At 713, the program logs in to the signal processing server to shut down the command detection program. Finally, at 714, the program executes a script to stop the command detection program.
Memory 510 stores check-in detection program 511 and several command detection programs 512. The check-in detection program 511 is executed when the server starts up. The command detection program 512 is executed by the deploy management program 212. Details of each element are described as follows.
The first process is to identify the user location. In this example, the program uses a sound source localization technique and identifies the user location at 911. The other process is to identify the user. In this example, the program uses a voice fingerprint to identify the user.
At 921, the program sends the raw data in a user identification request message to the management server, and acquires the user ID in a response message from the management server at 922. After completing the parallel processes, the program generates a check-in message with location information and user ID at 904. Finally, the program sends the check-in message to the management server at 905.
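The check-out branch at 711 to 714 can be sketched as follows. This is an illustrative outline under stated assumptions: the request is modeled as a dictionary with a hypothetical `user_id` key, the running process management table is modeled as a mapping from user ID to process ID, and the login and stop script of 713 and 714 are replaced by a caller-supplied `stop_process` callback.

```python
from typing import Callable, Dict


def handle_check_out(
    request: Dict[str, str],
    running_processes: Dict[str, int],
    stop_process: Callable[[int], None],
) -> int:
    """Handle a check-out request and return the ID of the stopped process."""
    user_id = request["user_id"]             # 711: retrieve the user ID
    process_id = running_processes[user_id]  # 712: find the process ID by user ID
    stop_process(process_id)                 # 713-714: login and stop script (stubbed)
    del running_processes[user_id]           # assumption: the row is dropped once stopped
    return process_id


stopped = []
table = {"u001": 4242}
result = handle_check_out({"type": "check-out", "user_id": "u001"}, table, stopped.append)
```

Removing the table row after the stop is an assumption of this sketch; the description does not state how the running process management table is updated after a check-out.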
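The description does not name a particular sound source localization technique. As one common possibility, a two-microphone time-difference-of-arrival estimate can be sketched as below. The function names, the two-microphone geometry, and the default speed of sound are assumptions of this sketch, not details taken from the description.

```python
import math
from typing import Sequence


def estimate_delay(left: Sequence[float], right: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means the sound reached the right microphone later.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two channels at this lag.
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_degrees(lag: int, sample_rate: float, mic_spacing_m: float,
                    speed_of_sound: float = 343.0) -> float:
    """Convert a delay into an arrival angle relative to the array broadside."""
    ratio = speed_of_sound * (lag / sample_rate) / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))


pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
lag = estimate_delay(pulse, delayed, max_lag=4)
```

A bearing alone does not give a position on the shop floor; a deployed system would likely combine several microphone pairs or map bearings to known work cells, which is outside the scope of this sketch.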
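The two parallel processes and the message assembly at 904 can be sketched with a thread pool. This is a sketch under assumptions: `locate_user` and `identify_user` are hypothetical callables standing in for the localization at 911 and for the request and response exchange with the management server at 921 and 922, and the message is modeled as a dictionary with illustrative keys.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def build_check_in_message(
    raw_audio: bytes,
    locate_user: Callable[[bytes], str],
    identify_user: Callable[[bytes], str],
) -> Dict[str, str]:
    """Run localization and identification in parallel, then build the message."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        location_future = pool.submit(locate_user, raw_audio)   # 911
        user_future = pool.submit(identify_user, raw_audio)     # 921-922
        # Both results are required before the message can be generated.
        location = location_future.result()
        user_id = user_future.result()
    return {"type": "check-in", "user_id": user_id, "location": location}  # 904


# Stubs stand in for the real localization and the management server round trip.
message = build_check_in_message(b"\x00\x01", lambda _: "cell-3", lambda _: "u001")
```

Sending the message to the management server at 905 is left out, since the transport is not specified in the description.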
As illustrated in
At 1004, the command detection program identifies a machine operational command from the speech data, which can be accomplished by utilizing Natural Language Understanding (NLU) algorithms.
At 1005, the command detection program checks for a command. If the command is determined to be a check-out command (Yes), the program sends a check-out request message to the management server at 1006. Otherwise (No), if the command is a command to operate a machine, the program sends the command to the machine at 1011. At 1007, the program completes the processing of the speech data, and loops back to the flow at 1003 if more speech is detected.
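The branch at 1005 and the loop back to 1003 can be sketched as a small dispatcher. The sketch assumes that commands have already been recognized as strings (the output of 1004), that a check-out is represented by the hypothetical string `"check-out"`, and that the set of valid machine commands is supplied by the caller; none of these representations is specified in the description.

```python
from typing import Callable, Iterable, Set


def run_command_loop(
    commands: Iterable[str],
    machine_commands: Set[str],
    send_to_machine: Callable[[str], None],
    send_check_out: Callable[[], None],
) -> None:
    """Dispatch recognized commands until no more speech is available."""
    for command in commands:               # loop back to 1003 while speech remains
        if command == "check-out":         # 1005: check-out command?
            send_check_out()               # 1006: notify the management server
        elif command in machine_commands:  # otherwise, a machine command?
            send_to_machine(command)       # 1011: forward it to the machine
        # 1007: processing of this utterance is complete; anything else is ignored


sent, check_outs = [], []
run_command_loop(
    ["start", "hello", "check-out"],
    {"start", "stop"},
    sent.append,
    lambda: check_outs.append("check-out"),
)
```

Ignoring utterances that are neither a check-out nor a known machine command is a design choice of this sketch, in keeping with the stated goal of preventing unintended operations.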
In example implementations as shown in
In example implementations, when the user management program 211 provides the user identification response to the check-in detection program 511 and the check-in process determines that the check-in is to be accepted, check-in detection program 511 can provide the check-in request to deploy management program 212, whereupon deploy management program 212 can provide the appropriate parameters for the command detection program 512 to execute a speech recognition algorithm and a denoising algorithm based on the user information and the location information. In an example as illustrated in
In example implementations, a denoising algorithm can be applied along with a speech recognition algorithm by command detection program 512 to ensure clarity of the voice commands received through the acoustic sensor or microphone. For example, the denoising algorithm is configured to adjust the acoustic sensor according to the configuration received (e.g., azimuth, orientation, etc.), and also to utilize the trained parameter set to filter out noise for the environment. Similarly, the speech recognition algorithm is configured with the provided parameters and recognizes speech accordingly. In example implementations, the parameters can be generated from a machine learning algorithm configured to provide settings for the speech recognition algorithm and denoising algorithm based on the user and the location. In another example implementation, the trained parameter set and the configuration can be provided so that the speech recognition algorithm and denoising algorithm can be executed based on the machine to be controlled. In such an example implementation, the denoising algorithm can be configured from parameters that were generated from machine learning while the machine was operating in normal conditions, to determine what the underlying noise in the environment of the machine is like. In such an implementation, the denoising algorithm can thereby subtract the noise from the processed audio to filter out environment noise and leave the command audio from the user intact. Further, the speech recognition algorithm can be configured with parameters involving command sets specific to the machine, so that the speech recognition algorithm accuracy can be improved through being configured to only identify the commands associated with the machine to be controlled.
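The idea of subtracting machine noise learned under normal operating conditions can be illustrated with classical spectral subtraction. This is a simplified stand-in, not the machine learning method of the description: the noise profile here is a plain per-bin average of magnitude frames, and both function names are hypothetical.

```python
from typing import List, Sequence


def learn_noise_profile(noise_frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-bin magnitudes over frames recorded during normal machine operation."""
    count = len(noise_frames)
    return [sum(bins) / count for bins in zip(*noise_frames)]


def subtract_noise(frame: Sequence[float], profile: Sequence[float],
                   floor: float = 0.0) -> List[float]:
    """Subtract the noise profile from one magnitude frame, clamping at `floor`."""
    return [max(magnitude - noise, floor) for magnitude, noise in zip(frame, profile)]


# Two noise-only frames with three frequency bins each (illustrative values).
profile = learn_noise_profile([[2.0, 4.0, 1.0], [4.0, 4.0, 3.0]])
cleaned = subtract_noise([3.0, 9.0, 1.0], profile)
```

In this toy input only the middle bin carries energy above the learned noise level, so it is the only bin that survives the subtraction. A practical system would compute the magnitude frames with a short-time Fourier transform and reuse the original phase when resynthesizing audio.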
Thus, depending on the desired implementation, the parameters can involve a selected microphone device associated with the location information, a beamforming parameter associated with the selected microphone device, and parameters associated with a selected application executing the denoising algorithm set based on the location information. Such parameters can be learned from a machine learning algorithm to generate a trained parameter set associated with the location to configure the speech recognition algorithm or the denoising algorithm in accordance with the desired implementation. The selected application can be a denoising algorithm selected from a plurality of denoising algorithms, and/or a speech recognition algorithm selected from a plurality of speech recognition algorithms.
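The parameters listed above can be grouped into one record per location. The following sketch is illustrative only: every field name, the table contents, and identifiers such as `mic-07` and `spectral_subtraction` are hypothetical examples rather than values from the description.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LocationParameters:
    microphone_id: str          # selected microphone device for the location
    beam_azimuth_deg: float     # beamforming parameter for that microphone
    denoiser: str               # denoising algorithm selected from several
    recognizer: str             # speech recognition algorithm selected from several
    trained_parameter_set: str  # identifier of the trained parameter set


# Hypothetical contents; a deployment would populate this from its own configuration.
LOCATION_TABLE: Dict[str, LocationParameters] = {
    "cell-3": LocationParameters("mic-07", 45.0, "spectral_subtraction",
                                 "recognizer_a", "params_cell3"),
}


def parameters_for(location: str) -> LocationParameters:
    """Return the parameters for a location, or raise if none are configured."""
    try:
        return LOCATION_TABLE[location]
    except KeyError:
        raise ValueError(f"no parameters configured for location {location!r}") from None


selected = parameters_for("cell-3")
```

Raising an error for an unconfigured location, rather than falling back to defaults, is a choice made in this sketch so that a command detection process is never started with parameters tuned for a different environment.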
In example implementations, the deploy management program utilizes the speech recognition algorithm and the denoising algorithm to process commands from speech received through the acoustic sensors or microphones and executes the process accordingly. In an example implementation, the process can be a control process for controlling a machine on the factory floor, wherein the machine executes processes based on the commands recognized by the command detection program 512.
However, the example implementations are not limited to controlling machine processes, and can be extended to other processes in accordance with the desired implementation. For example, the command detection program 512 can also execute a process to provide messages to the management server to check-out of a machine as illustrated in
Thus, through the example implementations described herein, it is possible to maximize the accuracy of command recognition on a noisy factory shop floor by using an appropriate parameter set based on the operator condition.
Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In example implementations, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.
Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” “displaying,” or the like, can include the actions and processes of a computer system or other information processing device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other information storage, transmission or display devices.
Example implementations may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may include one or more general-purpose computers selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored in a computer readable medium, such as a computer-readable storage medium or a computer-readable signal medium. A computer-readable storage medium may involve tangible mediums such as, but not limited to, optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible or non-transitory media suitable for storing electronic information. A computer readable signal medium may include mediums such as carrier waves. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Computer programs can involve pure software implementations that involve instructions that perform the operations of the desired implementation.
Various general-purpose systems may be used with programs and modules in accordance with the examples herein, or it may prove convenient to construct a more specialized apparatus to perform desired method steps. In addition, the example implementations are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the example implementations as described herein. The instructions of the programming language(s) may be executed by one or more processing devices, e.g., central processing units (CPUs), processors, or controllers.
As is known in the art, the operations described above can be performed by hardware, software, or some combination of software and hardware. Various aspects of the example implementations may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out implementations of the present application. Further, some example implementations of the present application may be performed solely in hardware, whereas other example implementations may be performed solely in software. Moreover, the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways. When performed by software, the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.
Moreover, other implementations of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the teachings of the present application. Various aspects and/or components of the described example implementations may be used singly or in any combination. It is intended that the specification and example implementations be considered as examples only, with the true scope and spirit of the present application being indicated by the following claims.