The present invention relates generally to the data processing field, and more particularly, relates to a method, system and computer program product for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems.
Machine learning algorithms typically train on labeled training data to produce a machine learning model. After training is finished, the model is used to evaluate each input test example and output a result for each.
Business users often organize data in a modular format. For example, for a bank customer, the data could be organized into chit chat, mortgage, investment, and the like. When a bank customer wants to build a machine learning system to direct its clients to detailed transaction procedures, a need exists to integrate all of this modular data. Therefore, a need exists for a good way to perform the data integration and to design a machine learning model that better understands and utilizes the modular structure. However, a common machine learning model lacks knowledge of examples from other modules during training. As a result, model predictions and confidences are difficult to compare across different models.
A need exists for an efficient and effective mechanism for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems.
Principal aspects of the present invention are to provide a method, system and computer program product for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems. Other important aspects of the present invention are to provide such method, system and computer program product substantially without negative effects and that overcome many of the disadvantages of prior art arrangements.
In brief, a method, system and computer program product are provided for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems. User intents are identified using separately trained models with corresponding training data. Natural language processing (NLP) and statistical analysis are applied on the training data to classify the training data into groups and modules. A confidence rescaling algorithm is used for combining results from the modules. The dynamic confidence rescaling uses statistical information computed about each module being combined to identify user intents with enhanced accuracies in comparison to baseline models without confidence rescaling.
In accordance with features of the invention, experimental results using real customer data and real conversational intent classification scenarios show enhanced accuracies for user intent recognition when the confidence rescaling algorithm is used.
The present invention together with the above and other objects and advantages may best be understood from the following detailed description of the preferred embodiments of the invention illustrated in the drawings, wherein:
In the following detailed description of embodiments of the invention, reference is made to the accompanying drawings, which illustrate example embodiments by which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In accordance with features of the invention, a method and system are provided for implementing enhanced dynamic confidence rescaling for modularity in automatic user intent detection systems. User intents are identified using separately trained models with corresponding training data. Natural language processing (NLP) and statistical analysis are applied on the training data to classify the training data into groups and modules. A confidence rescaling algorithm is used for combining results from the modules. The dynamic confidence rescaling uses statistical information computed about each module being combined to identify user intents with enhanced accuracies in comparison to baseline models without confidence rescaling.
In general, machine learning based classification usually treats the data of each class as equally important. Typical machine learning based classification has no knowledge of how the classes were originally organized or which classes could be related. It is also often observed that small classes are affected by large classes within the same training set. When data is merged into a larger training set for higher level intent detection, the machine learning model is often easily affected, and the accuracy of intents with fewer examples is lower than when training on their own data.
In accordance with features of the invention, a machine learning model adjusts a final prediction using additional structural information of the classes and maintains enhanced accuracies for most classes, including small classes. A main feature of the invention is that all training data used in the machine learning models is used to train and generate one model. The model prediction output is then adjusted using structural information generated from multiple modules. The adaptation of the model prediction output provides dynamic confidence rescaling using statistical information computed about each module being combined. In many experiments on real customer data and real conversational intent classification scenarios, using dynamic confidence rescaling provides improved overall classification accuracy on all modules combined to identify user intents.
Having reference now to the drawings, in
Computer system 102 includes a system memory 106 including an operating system 108, a user intent detection control logic 110 and a confidence rescaling algorithm 111. System memory 106 is a random-access semiconductor memory for storing data, including programs. System memory 106 comprises, for example, a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a current double data rate (DDRx) SDRAM, non-volatile memory, optical storage, and other storage devices.
Computer system 102 includes a storage 112 including a machine learning model 114 and a network interface 116. Computer system 102 includes an I/O interface 118 for transferring data to and from computer system components including CPU 104, memory 106 including the operating system 108, user intent detection control logic 110, confidence rescaling algorithm 111, storage 112 including machine learning model 114, network interface 116, a network 120, and a client system and user input 122.
In accordance with features of the invention, dynamic confidence rescaling for modularity yields substantial gains in intent recognition accuracy over conventional intent detection systems where the intent result is composed from multiple independent sub-domain intent detection systems.
Referring to
Referring to
Referring to
Then the total size of intents imported is computed as indicated at a block 304. The computed total size of intents imported measures how complex the imported intent domain is, since more imported intents indicate a more complex domain. The first two metrics (SA_W) and (ST_W) are relative indicators, compared against the base domain (the module being imported to), because there is a need to compare the intent predictions between these modules. As shown in a block 306, to get the relative numbers for the metrics (SA_W) and (ST_W), these two metrics are computed on the base domain module as well. The two corresponding metrics for the base domain are SA_P and ST_P. Thus, the two relative metrics are ALPHA=SA_W/SA_P and BETA=ST_W/ST_P. As shown in a block 308, a non-linear function is used to combine these metrics together as a confidence rescaling factor. The function is: X*ALPHA+F(BETA), where F is (1−EXP(−0.5*BETA))/(1+EXP(−0.5*BETA)).
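The computation at blocks 306 and 308 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names `squash` and `rescaling_factor` are hypothetical, SA is read as an average intent size and ST as a total intent size, and the coefficient X is left as a caller-supplied parameter because the description does not give its value (the default of 1.0 is an assumption).

```python
import math


def squash(beta: float) -> float:
    """F(BETA) = (1 - EXP(-0.5*BETA)) / (1 + EXP(-0.5*BETA)).

    Algebraically this equals tanh(BETA / 4), so it stays below 1
    for any positive BETA.
    """
    e = math.exp(-0.5 * beta)
    return (1.0 - e) / (1.0 + e)


def rescaling_factor(sa_w: float, st_w: float,
                     sa_p: float, st_p: float,
                     x: float = 1.0) -> float:
    """Combine the relative metrics into X*ALPHA + F(BETA).

    sa_w, st_w: metrics of the imported module (SA_W, ST_W).
    sa_p, st_p: the same metrics on the base module (SA_P, ST_P).
    x: coefficient X; its value is not stated in the text, so the
       default here is only a placeholder.
    """
    if sa_p <= 0 or st_p <= 0:
        raise ValueError("base-module metrics must be positive")
    alpha = sa_w / sa_p  # ALPHA = SA_W / SA_P
    beta = st_w / st_p   # BETA  = ST_W / ST_P
    return x * alpha + squash(beta)


if __name__ == "__main__":
    # Hypothetical sizes: the imported module is twice the base on both metrics.
    print(round(rescaling_factor(sa_w=10, st_w=100, sa_p=5, st_p=50), 4))
```

Because F is bounded, the BETA term contributes less than 1 to the factor, while the ALPHA term grows linearly with the relative average intent size.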
The overall idea is that the larger the average size of the imported intents, the larger the rescaling factor for the base intent module, and the larger the total size of the imported intents, the larger the rescaling factor for the base intent module. In addition, to keep the base intent module stable, the rescaling factor is somewhat aggressive for the base module. This is done to prefer the more important user intents over the imported ones.
Experimental results have shown that with dynamic confidence rescaling, the accuracies for most intents from both modules are much higher than with simple merging without this technique. Experimental results have shown that the more intents are imported, the bigger the impact on the original base module. Thus, stronger confidence adjustment is needed. Experimental results have shown that different importing cases yield different scaling factors, ranging from 1 to 20. Experimental results have shown that dynamic confidence rescaling provides a decent estimate of the rescaling factor and, in turn, close to optimal accuracies for most intents from both modules.
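One way to picture how a rescaling factor favors the base intent module is sketched below. The description does not state how the factor is applied to the confidences, so this sketch assumes, purely for illustration, that each base-module confidence is multiplied by the factor before the top intent is chosen. The function name `combine_predictions` and the intent names and scores are hypothetical.

```python
from typing import Dict, Tuple


def combine_predictions(base_scores: Dict[str, float],
                        imported_scores: Dict[str, float],
                        factor: float) -> Tuple[str, Dict[str, float]]:
    """Pick the top intent after rescaling base-module confidences.

    Assumption (not from the source): the factor multiplies the
    confidences of base-module intents; imported intents are unchanged.
    Intent names are assumed to be unique across the two modules.
    """
    rescaled = {intent: conf * factor for intent, conf in base_scores.items()}
    rescaled.update(imported_scores)
    best = max(rescaled, key=rescaled.get)
    return best, rescaled


if __name__ == "__main__":
    base = {"check_balance": 0.30}      # hypothetical base-module intent
    imported = {"mortgage_rate": 0.45}  # hypothetical imported intent
    print(combine_predictions(base, imported, factor=1.0)[0])
    print(combine_predictions(base, imported, factor=2.0)[0])
```

With a factor of 1.0 the imported intent wins on raw confidence; with a factor of 2.0 the base intent is preferred, which matches the stated goal of keeping the base module stable.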
Referring now to
As indicated at a block 402, a bank chatbot provides multiple training domains including a personal account, an investment, and a mortgage as indicated at respective blocks 404, 406, and 408. At block 404, the personal account provides further multiple training domains including an online account, a credit card, and the like, as indicated at respective blocks 404, 406, and 408.
Referring now to
As indicated at respective blocks 502, 504, and 506, multiple domains include the personal account 502, the investment 504, and the mortgage 506. Confidence rescaling is applied to each domain of the personal account 502, the investment 504, and the mortgage 506. As indicated at respective blocks 508 and 510, multiple testing domains from personal account 502 having confidence rescaling applied include the online account 508 and the credit card 510. Confidence rescaling is applied to each domain of the online account 508 and the credit card 510.
Referring now to
Computer readable program instructions 604, 606, 608, and 610 described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The computer program product 600 may include cloud based software residing as a cloud application, commonly referred to as Software as a Service (SaaS). The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions 604, 606, 608, and 610 from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
A sequence of program instructions or a logical assembly of one or more interrelated modules defined by the recorded program means 604, 606, 608, and 610, directs the system 100 for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems of the preferred embodiment.
While the present invention has been described with reference to the details of the embodiments of the invention shown in the drawing, these details are not intended to limit the scope of the invention as claimed in the appended claims.