A portion of the disclosure of this patent document contains material to which a claim for copyright is made. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but reserves all other copyright rights whatsoever.
Embodiments of the present invention relate to artificial intelligence systems for training classifiers.
There are numerous reasons for classifying entities. A binary classification indicates whether or not an entity is in a particular class. Classification can be based on the publications of an entity, including social media publications. The publications are analyzed for the presence of indicators, such as key words. The presence or absence of an indicator may be digitally stored as a binary value: 1 if said indicator is present in a particular publication or group of publications, and 0 if it is not. Prior art systems have assigned different weights to different indicators, recognizing that some indicators are stronger than others. It has been discovered, however, that when there is a large number of low-weight indicators in an entity's publications, prior art systems tend to overpredict the probability that the entity is in a particular class. There is a need, therefore, for an artificial intelligence system for training a classifier that does not overpredict due to large numbers of low-weight indicators.
The summary of the invention is provided as a guide to understanding the invention. It does not necessarily describe the most generic embodiment of the invention or the broadest range of alternative embodiments.
A system for training a classifier has a database of training data and a modeling system for building a classification model based on the training data. The database has a binary class for each entity and binary tokens indicating whether or not one or more indicators about the entity are true. The classification model is based on a tempered indication of the tokens: the weighted sum of the tokens for each entity divided by a tempering factor. The tempering factor is a function of the unweighted sum of the tokens for each entity. Thus, the tempering factor reduces the tempered indication when large numbers of low-weight tokens are present, so that the model does not overpredict the probability of an entity being in a class.
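By way of illustration only, a minimal sketch of the tempered indication follows, assuming a linear tempering factor with the offset described in the detailed description; the parameter `a` and the function name are illustrative assumptions, not the claimed implementation:

```python
def tempered_indication(tokens, weights, a=0.5):
    """Tempered indication for one entity j.

    tokens  -- binary token values x_i (0 or 1) for the entity
    weights -- indicator weight w_i for each token i
    a       -- tempering parameter (illustrative assumption; the
               disclosure only requires the tempering factor to be a
               function of the unweighted sum of the tokens)
    """
    weighted_sum = sum(w * x for w, x in zip(weights, tokens))
    unweighted_sum = sum(tokens)
    # The tempering factor equals 1 when exactly one indicator is
    # present and grows as more indicators are found, damping the
    # contribution of many low-weight indicators.
    tempering_factor = 1 + a * max(unweighted_sum - 1, 0)
    return weighted_sum / tempering_factor
```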
The detailed description describes non-limiting exemplary embodiments. Any individual features may be combined with other features as required by different applications for at least the benefits described herein.
As used herein, the term “about” means plus or minus 10% of a given value unless specifically indicated otherwise.
As used herein, a “computer-based system”, “computer system”, “database” or “engine” comprises an input device for receiving data, an output device for outputting data, a permanent memory for storing data as well as computer code, and a microprocessor for executing computer code. The computer code resident in said permanent memory will physically cause said microprocessor to read-in data via said input device, process said data within said microprocessor, and output said processed data via said output device.
As used herein a “binary value” is any type of computer data that can have two states. Said data may be, but is not limited to, a bit, an integer, a character string, or a floating point number. A binary value of “1” or “true” is interpreted as the number 1 for arithmetic calculations. A binary value of “0” or “false” is interpreted as the number 0 for arithmetic calculations.
As used herein, the symbols “h”, “i” and “j” refer to index numbers for one of a plurality of objects. Thus, the term “entity j” refers to a jth entity in a plurality of said entities. The term “token i” refers to an ith token in a plurality of said tokens.
As used herein, the term “adjudicated class” means a classification that has been done independently, at least in some respect, of the data used to train a classifier. For example, an insurance claim may be adjudicated as legitimate or fraudulent by a human adjudicator without reference to the social media publications used to train the classifier.
A token i is a binary indication of the presence or absence of an indicator i in a publication by an entity j. The publication may be a social media publication or any publication under the control of the entity, such as text messages, emails, magazine articles, etc. The indicator may be any aspect of the publication, such as words, phrases, word stems, pictures, videos, font, audio or graphic layout. The token i has a binary value of 1 when said binary indication is true (i.e. the indicator i is present in the publication) and a binary value of 0 when said binary indication is false (i.e. the indicator i is not present in the publication).
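For illustration only (the indicator words and the substring-matching rule below are hypothetical, not taken from the disclosure), binary tokens for a text publication might be derived as follows:

```python
def tokenize(publication_text, indicators):
    """Return binary token i = 1 if indicator i appears anywhere in
    the publication, 0 otherwise."""
    text = publication_text.lower()
    return [1 if indicator.lower() in text else 0
            for indicator in indicators]

# Hypothetical indicator words, for illustration only.
indicators = ["marathon", "surfing", "skiing"]
tokens = tokenize("Ran a marathon last weekend!", indicators)
# tokens == [1, 0, 0]
```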
The computer-implemented modeling engine 120 comprises a microprocessor and computer-readable instructions stored on a permanent memory. The computer-readable instructions are operable to cause said microprocessor to physically carry out the steps of: reading in the training data from the training database; calculating a tempered indication for each training entity j; transforming said tempered indication into a probability of being in said class; calculating an error function between said probabilities and said adjudicated classes; adjusting the indicator weights and tempering parameters to reduce said error function; and outputting 124 the resulting classification model.
The output 124 may be to an automated classification system that reads in token data for a prospective entity h and uses said model to determine a probability of said prospective entity h being in said class. For example, the prospective entities might be one or more insurance claimants and their associated insurance claims, and the class may be whether or not a claim is fraudulent. Thus, the automated classification system can be used to determine whether one or more insurance claims are fraudulent.
The tempering factor has a value of 1 when there is only one indicator found in an entity's data (i.e. the unweighted sum of said tokens i for said entity j has a value of 1). This is set by the offset factor 146.
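One functional form consistent with this constraint (a sketch only; the disclosure fixes the offset but not the full functional form) is a linear tempering factor with offset:

```latex
t_j = 1 + a\Bigl(\sum_i x_{ij} - 1\Bigr)
```

where \(x_{ij}\) is token i for entity j and \(a \ge 0\) is a tempering parameter; the offset of 1 ensures that \(t_j = 1\) when the unweighted sum of the tokens is 1.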
The formula for the tempered indication is shown in the accompanying figure.
In order to compare the tempered indication to a binary class, the tempered indication may be transformed into a real value between 0 and 1 by a normalized asymptotic transformation. A particular normalized asymptotic transformation 152 is shown in the accompanying figure.
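One common normalized asymptotic transformation is the logistic sigmoid, shown here as an assumed example (the disclosure's particular transformation 152 may differ):

```latex
P_j = \frac{1}{1 + e^{-T_j}}
```

where \(T_j\) is the tempered indication for entity j and \(P_j\) approaches 0 and 1 asymptotically.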
A particular function 160 is shown for calculating an error function: the unweighted sum of squares (i.e. “SSQ”) of the residuals. Any error function may be used, however, that provides an aggregate measure of how well the model fits the training data. An alternative error function is a weighted sum of squares of the residuals, where the weights are related to the size or importance of the training entities j relative to each other.
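A minimal end-to-end fitting sketch follows, combining the assumed linear tempering factor and sigmoid transformation above and minimizing the SSQ with a general-purpose optimizer; all names and the choice of optimizer are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def fit_model(X, y):
    """Fit indicator weights w_i and a tempering parameter a by
    minimizing the unweighted sum of squares of the residuals.

    X -- (n_entities, n_indicators) array of binary tokens
    y -- (n_entities,) array of adjudicated classes (0 or 1)
    """
    n = X.shape[1]

    def ssq(params):
        w, a = params[:n], params[n]
        t = 1 + a * np.maximum(X.sum(axis=1) - 1, 0)  # tempering factor
        tempered = (X @ w) / t                        # tempered indication
        p = 1 / (1 + np.exp(-tempered))               # asymptotic transform
        return np.sum((y - p) ** 2)                   # SSQ error function

    res = minimize(ssq, np.zeros(n + 1), method="Nelder-Mead")
    return res.x[:n], res.x[n]
```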
One of the practical applications of the system for training a classifier is fraud detection in insurance claims. An example of an insurance claim is a workers' compensation insurance claim. Each insurance claim has a claimant. An example of a class for a workers' compensation insurance claim is whether or not the claim is fraudulent.
A set of 100 workers' compensation claimants and associated insurance claims (i.e. training entities j) was adjudicated to determine which claims were legitimate and which were fraudulent. The social media sites of the claimants were then analyzed to identify the presence or absence of six words indicative of whether or not said claims were fraudulent. These six words are the indicators i. The class of each training entity j was associated with an event date: the date of the workplace accident that led to the claimant's insurance claim. The publications used for each training entity j were dated after that entity's event date. The adjudicated classes and indicator tokens were then stored in a training database. A modeling engine then read in the data, and the indicator weights and tempering parameters of a tempered indication were calculated based on the model (i.e. items 130, 150, 160) described above.
The classifier system then calculates a probability of fraud Ph using the tempered indication and the asymptotic transformation 510.
The classifier system then compares 520 the calculated probability of fraud Ph with a threshold probability of fraud Po. If the calculated probability of fraud is greater than or equal to the threshold probability, then the claim is flagged 533 for further adjudication. If the calculated probability of fraud is less than the threshold probability, then the claim is deemed not fraudulent 531.
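The application of the trained model and the threshold test 520 might look like the following sketch; the default Po and the function names are illustrative:

```python
import math

def probability_of_fraud(tokens_h, weights, a):
    """Probability Ph for a prospective claimant h, using the
    tempered indication and sigmoid transformation sketched above."""
    t = tempered_indication(tokens_h, weights, a)
    return 1 / (1 + math.exp(-t))

def screen_claim(p_h, p_o=0.5):
    """Flag the claim for further adjudication when Ph >= Po."""
    return "flagged for adjudication" if p_h >= p_o else "not fraudulent"
```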
The flagged claims may be reviewed 535 by an adjudicator 530 to make a final determination of whether they are fraudulent 539 or not fraudulent 537. Claims found fraudulent are denied for payment; claims found not fraudulent are approved for payment.
Because the classifier system calculates actual probabilities of fraud, quantitative estimates of the fractions of false positives 534 and false negatives 532 can be made. The threshold probability can then be adjusted 540 to minimize the sum of the cost of unwittingly paying false negative claims 532 that are actually fraudulent and the cost of adjudicating 530 the flagged claims 533 to determine which ones are not fraudulent.
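A sketch of the cost-based threshold adjustment 540 follows; the per-claim cost figures are placeholders, not values from the disclosure:

```python
def total_cost(p_o, probs, cost_fraud_paid=10_000.0, cost_review=500.0):
    """Estimated total cost at threshold Po: claims below the
    threshold are paid (incurring the expected loss from fraudulent
    false negatives); claims at or above it incur a review cost."""
    return sum(p * cost_fraud_paid if p < p_o else cost_review
               for p in probs)

# Pick the threshold that minimizes the total estimated cost:
# best_po = min((i / 100 for i in range(101)),
#               key=lambda t: total_cost(t, probs))
```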
While the disclosure has been described with reference to one or more different exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the disclosure. In addition, many modifications may be made to adapt to a particular situation without departing from the essential scope or teachings thereof. Therefore, it is intended that the disclosure not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention. For example, the methods described herein may be applied to multi-valued or even scalar classes of entities. They can also be extended to tokens that are scalars, such as the number of times a particular indicator is present in a publication, or the degree to which an indicator is present.