This technology relates to applying a computer-implemented scoring function to a series of variables in a computer system.
When a loan application is presented to a lender, the lender decides whether to fund the loan or not. In modern lending systems, this is done by taking data in the loan application, possibly aggregating it with external data from other data sources, and then applying a scoring function to the data to generate a score. Typically, the lender will fund the loan only if the generated score exceeds a certain threshold. Except in extraordinary cases, a computer program located either locally or remotely performs the scoring operation. Similarly, when a collections agent wishes to prioritize defaulted loans upon which they intend to act, they can apply one or more scoring functions to the defaulted loans to generate a score or scores, and perform one or more collection actions to encourage payment based on a prioritization from the generated score or scores. Similarly, when a marketer wishes to determine how best to execute a marketing campaign through one or more media, the marketer can collect data related to all possible targets of the marketing campaign and rank them by applying a scoring function. This helps to optimize the performance of the campaign for the amount of money spent. All of these applications are homologous: an individual or entity wants to make a decision for one or more business items, so the individual or entity passes information through a scoring function that generates one or more scores. The generated score or scores are then used to make the decision for the one or more business items or to prioritize actions.
In general, two groups collaborate to develop these scoring functions: a group of modelers (often referred to as underwriters or by other names in different contexts) and a group of programmers (often referred to as software engineers, developers, or similar names). The modelers, working in a domain-specific language (DSL) or system such as SAS, SPSS, Stata, R, S-plus, MATLAB or others, may build a prototype implementation of the scoring function. This prototype can then be given to the programmers, who reimplement the prototype in a general purpose language (GP language) such as C, FORTRAN, Ruby, or C++, before incorporating the implementation into a larger system that delivers scores.
This method of implementation has a number of drawbacks. First, it can require a long period of time to deploy a scoring function, since the reimplementation process is delicate and difficult.
Additionally, the resulting scoring function is relatively difficult to test because tests typically need to be reimplemented in the GP language.
Additionally, DSLs often incorporate algorithms that make the mathematical operations used in the scoring function more stable or accurate. Most GP languages do not include such algorithms, as they make a minor contribution to the uses for which the GP languages are normally applied. Even in cases where programmers have access to tools that implement special purpose algorithms, it can be essentially impossible to guarantee that in all cases the results in the larger embedded system match the results in the prototype scoring function.
Additionally, scoring functions can return unexpected results for reasonable inputs and thus fail to accurately compute solutions. In such cases a final diagnosis of the nature of the failure and its appropriate fix should fall to the group of modelers. If the prototype has been reimplemented, few, if any, members of the modeling team will have the sophistication to diagnose any problems with the reimplemented prototype.
Additionally, systems that solve qualitatively different problems (underwriting versus collections versus marketing) will typically have separate implementations. This introduces additional problems, ranging from a lack of shared context (such as the environment the code was written in, including the tools used to write the code and the members of the team who wrote the code) among different programmers, to a lack of common testing infrastructure, even causing potential political strains within an organization due to replication of roles among different units.
By removing the reimplementation process, the drawbacks listed above can be resolved. If an API between the presentation layer and the scoring layer is suitably designed, introducing or upgrading a scoring function can be trivial, since the scoring layer can be easily replaced (if it is a discrete component of the scoring system) or the scoring system can be stopped and restarted with a new scoring function in place. Modelers can debug the implementation of the scoring function, as they wrote it and are therefore familiar with the implementation language and code. Since the scoring function runs in the DSL itself, the scoring function will continue to use any of the special computational or algorithmic features described above.
The division of the main API into two layers makes the process of building a client for a given system more straightforward and reproducible, particularly since many of the interactions between the kernel and the client level are human readable and can thus be read by non-coders. This exposes most of the internal structure of the client layer in an easy to understand medium, making it easier to share knowledge among and between developers and teams, reducing the cost of building new client layers, and thus reducing costs overall.
Illustrated in the accompanying drawing(s) is at least one of the best mode embodiments of the present technology in such drawing(s):
The following description is not intended to limit the technology, but rather to enable any person skilled in the art to make and use this technology. Although any methods, materials, and devices similar or equivalent to those described herein can be used in the practice or testing of embodiments, the methods, materials, and devices are now described.
The present technology relates to improved methods and systems for replacement of the above-described monolithic approach of re-implementing a prototype in a different language for different applications. Herein is described a multipartite approach in which prototype code can be directly incorporated into a production system. This multipartite system can incorporate an outer presentation layer that duplicates the behavior of the previous approach and an inner scoring layer built from a prototype originally written by a modeling team. These two layers can communicate with one another through a prescribed application programming interface (API) that can be implemented within a single encapsulated unit of execution (a process), between two applications on a single processing system (using an “inter-process communication system”), or between or among applications running on different processing systems (using a “remote communication system”).
In the described system, the scoring layer can be a composite of two layers, a client interface and a kernel. Qualitatively different applications typically use different client layers, but all client layers can share a single kernel. In this model, a batch scoring system that can be applied naturally to a marketing problem can have a first client layer, while a real-time scoring system that can be naturally applied to loan underwriting can have a second client layer. In an example embodiment of this system, one or more client layers can interact with the kernel layer through a single common interface and the one or more client layers can expose the API described above, for instance by enabling objects to be accessed through different means. In order to provide as much common structure as possible, the client layer and the kernel can communicate through an internal API which consists of a set of down-calls from the client layer to the kernel, a set of callbacks by which the kernel extracts information about any input data from the client layer, and a set of human-readable and human-editable files which describe how the callbacks operate and what the callbacks depend upon.
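As a minimal sketch of the layering described above, the following Python fragment shows one shared kernel serving two different client layers. The names Kernel, ClientLayer, build_rows and fetch_field are hypothetical and chosen only for illustration, and the human-readable configuration files are stood in for by a plain dictionary.

```python
from typing import Any, Callable, Dict, List


class Kernel:
    """Shared kernel: receives down-calls and pulls input data via callbacks."""

    def build_rows(self, record_ids: List[str],
                   fetch_field: Callable[[str, str], Any],
                   config: Dict[str, List[str]]) -> List[Dict[str, Any]]:
        # Down-call from a client layer. The kernel does not know where the
        # data lives; it asks the client through the fetch_field callback.
        columns = config["columns"]
        return [{col: fetch_field(rid, col) for col in columns}
                for rid in record_ids]


class ClientLayer:
    """Application-specific client layer (hypothetical)."""

    def __init__(self, kernel: Kernel, store: Dict[str, Dict[str, Any]],
                 config: Dict[str, List[str]]):
        self.kernel = kernel
        self.store = store
        self.config = config  # stands in for the human-editable files

    def fetch_field(self, record_id: str, column: str) -> Any:
        # Callback invoked by the kernel to extract input data.
        return self.store[record_id].get(column)

    def rows_for(self, record_ids: List[str]) -> List[Dict[str, Any]]:
        return self.kernel.build_rows(record_ids, self.fetch_field, self.config)


# Two qualitatively different clients sharing a single kernel.
kernel = Kernel()
underwriting = ClientLayer(kernel, {"a1": {"income": 52000, "zip": "90012"}},
                           {"columns": ["income", "zip"]})
marketing = ClientLayer(kernel, {"m1": {"zip": "10001"}, "m2": {"zip": "60601"}},
                        {"columns": ["zip"]})
```

In this sketch the real-time underwriting client would typically request a single row, while the batch marketing client would request many rows at once; both reuse the same kernel code.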
External data 404 can include data collected in response to the application-specific data request 402. This data can include results from credit bureaus, alternative credit bureaus, government records, similar sources or other sources located outside the system.
Persistent local data 406 can include data that is kept in a local database repository for more than one scoring operation. Persistent local data 406 can include information collected during previous computations of scores, such as a last date of a loan application, or external data that is present for all applications, such as geographical mapping information such as latitude and longitude coordinates.
Data conditioning routine 408 can condition data from 402, 404 and 406 into a form usable by the system. In the example embodiment this can include compiling the data into a form of one or more completed rows 410. In many embodiments, all rows 410 presented to the scoring function 412 must contain the same columns in the same order for each call; that is, the scoring function 412 can return scoring values 414 for a rectangular table with a constant columnar structure. Scoring function 412 output can also be saved in persistent local data 406 for later usage.
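The requirement of a constant columnar structure can be checked before rows reach the scoring function. The following is a minimal sketch in Python, assuming each row is represented as a dictionary whose key order is its column order; the function name check_rectangular is hypothetical.

```python
from typing import Any, Dict, List, Sequence


def check_rectangular(rows: Sequence[Dict[str, Any]]) -> List[str]:
    """Return the shared column order, or raise if any row deviates."""
    if not rows:
        raise ValueError("at least one row is required")
    columns = list(rows[0].keys())
    for i, row in enumerate(rows):
        # Both the set of columns and their order must match the first row.
        if list(row.keys()) != columns:
            raise ValueError(
                f"row {i} has columns {list(row.keys())}, expected {columns}")
    return columns
```

A conditioning routine could call such a check as its last step, so that a malformed row is rejected before scoring rather than silently misaligned.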
Additional credit bureau data 506 can include data collected by parsers 504 in order to complete a request received from 502. Data 506 can include results gathered from credit bureaus, alternative credit bureaus, government records, similar sources or other sources located outside the system.
Parsers 504 can then send the compiled data from 502 and 504 to a metadata computation module 508. In an example embodiment, metadata computation module 508 can further gather and receive data from proprietary or third party sources. As shown in the example embodiment, these can include geographic information system 510, regional econometric data 512, regional demographic data 514 or other sources and can include data that is kept in a local database repository for more than one scoring operation and which can be updated according to system requirements or rules or from other, external sources. Geographical information system 510 can provide information about physical locations on the Earth, such as the location of the Empire State Building. This data can then be used to determine an unemployment rate, average salary of persons near that location or other representative information. Regional econometric data 512 can include age profiles or other similar data from regional demographic data 514. In addition, these example forms of data can be merged with further forms of useful data (not shown). For example, data from a cable provider can be merged in order to determine if a particular address has cable access, and, if so, what level of service is provided at the address.
Metadata computation module 508 can include one or more sub-modules operable to calculate one or more metavariables as required for a particular embodiment. Metadata computation module 508 can then store data in one or more computed rows 516 for processing by scoring function 518 in order to generate scores 520. In many embodiments, all rows 516 presented to the scoring function 518 must contain the same columns in the same order for each call; that is, the scoring function 518 can return scores 520 for a rectangular table with a constant columnar structure.
Additional credit bureau data 606 can include data collected by parsers 604 in order to complete a request received from 602. Data 606 can include results gathered from credit bureaus, alternative credit bureaus, government records, similar sources or other sources located outside the system.
Parsers 604 can then send the compiled data from 602 and 604 to a metadata computation module 608. Metadata computation module 608 can further gather data from geographic information system 610, regional econometric data 612, regional demographic data 614 or other sources and can include data that is kept in a local database repository for more than one scoring operation and which can be updated according to system requirements or rules or from other, external sources. Geographical information system 610 can provide information about physical locations on the Earth, such as the location of the Empire State Building. This data can then be used to determine an unemployment rate, average salary of persons near that location or other representative information. Regional econometric data 612 can include age profiles or other similar data from regional demographic data 614. In addition, these example forms of data can be merged with further forms of useful data (not shown). For example, data from a cable provider can be merged in order to determine if a particular address has cable access, and, if so, what level of service is provided at the address.
Metadata computation module 608 can include one or more sub-modules operable to calculate one or more metavariables as required for a particular embodiment. Metadata computation module 608 can then store data in many completed rows 616 for processing by scoring function 618 in order to generate lead ranks 620. In many embodiments, all rows 616 presented to the scoring function 618 must contain the same columns in the same order for each call; that is, the scoring function 618 can return scores 620 for a rectangular table with a constant columnar structure. Lead ranks 620 can undergo further processing for categorization or other organization.
Architecture
Mobile applications, mobile devices such as smart phones/tablets, application programming interfaces (APIs), databases, social media platforms including social media profiles or other sharing capabilities, load balancers, web applications, page views, networking devices such as routers, terminals, gateways, network bridges, switches, hubs, repeaters, protocol converters, bridge routers, proxy servers, firewalls, network address translators, multiplexers, network interface controllers, wireless interface controllers, modems, ISDN terminal adapters, line drivers, wireless access points, cables, servers and other equipment and devices as appropriate to implement the method and system are contemplated.
Setup, User and Viewer Interaction
In the current system, the outward-facing or front end portion of the code can be written in one or more general purpose languages often used for building web sites and web services and stored in non-transitory, computer-readable memory. In some example embodiments this language can include “Ruby”. The inner-facing or back end portion of the system, including the scoring function, can be written in one or more open source domain-specific languages designed particularly to handle statistical computations. In some example embodiments this language can include “R”. In the example embodiment, the “Ruby” front end and the “R” back end portions of the code run in separate processes on the same computer. They can also communicate with one another across a local socket using a special purpose binary protocol. The front end process can also implement a web service with a “RESTful” API (where REST is Representational State Transfer).
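The binary protocol used between the two processes is not specified here. Purely as an illustration of how two local processes can exchange framed messages over a socket, the following Python sketch uses a 4-byte length prefix followed by a JSON body. The framing, the use of JSON, and the function names are all assumptions of this sketch and are not the protocol of the example embodiment, whose front end and back end are written in Ruby and R respectively.

```python
import json
import socket
import struct


def send_message(sock: socket.socket, payload: dict) -> None:
    """Send one framed message: 4-byte big-endian length, then the body."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> dict:
    """Receive one framed message written by send_message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))
```

A length prefix lets the receiver know where each request ends without relying on delimiters, which is one common reason to prefer a binary framing on a local socket.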
General Approach to Scoring Functions
In general, scoring functions of a scoring system (e.g. for interface with scoring system interface 143 of
In many embodiments, since a scoring function is a function, operating as a set of instructions stored in a non-transitory, computer-readable medium and executable by a processor, then the data upon which the scoring function is to be evaluated can be an array of one or more rows containing all of the base columns within the scoring function's domain. In this case, the scoring function returns one value for each row in the matrix. A value may not be a single real number or integer, but can be a compound object, as in a marketing model embodiment, to provide multiple priority scores for each potential target.
This structure is implemented in a main API (e.g. as implemented in an API 142 of
Metavariables can be derived quantities which can be used to compute scores in example embodiments. In a loan application example, three metavariable examples can include: (1) a latitude and longitude of a point from which an application for a loan is made, (2) a latitude and longitude of a loan applicant's claimed address, and (3) a distance between the point at which an application is submitted and a loan applicant's claimed home address. Although each of these three is an example of a metavariable, the third example is not computable directly from a loan applicant's loan application data. Rather, it must be computed indirectly using the first two metavariable examples. Thus, metavariables can be independent of one another, as the first two examples show, meaning that they do not depend on other metavariables. Alternatively or additionally, metavariables can be dependent on other metavariables, as the third example shows, where the third example metavariable is dependent on the first two example metavariables.
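The dependence of the third metavariable on the first two can be illustrated with a short Python sketch. It assumes the haversine great-circle formula on a spherical Earth, which is a simplification and not necessarily the distance computation used in any embodiment; the coordinates and variable names are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Metavariables (1) and (2): independent of other metavariables.
application_point = (40.7484, -73.9857)   # hypothetical coordinates
claimed_home_point = (40.6892, -74.0445)  # hypothetical coordinates

# Metavariable (3): depends on metavariables (1) and (2).
distance_km = haversine_km(application_point, claimed_home_point)
```

The third value cannot be evaluated until the first two exist, which is the ordering constraint a metavariable computation must respect.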
The process of parsing the data, using a processor, into a form to which outside data can be joined and from which metavariables can be computed is configured as a part of the client layer. In some embodiments, this configuration can be implemented using a text file containing a Universal Parsing Definition (UPD) object represented in JSON, a standard textual format for storing or exchanging structured data. A Universal Metadata Definition (UMD) object, also represented as a JSON object, can describe how the metavariables are computed from the resulting parsed objects. A black-list, also represented by a JSON object, can then describe which variables are obscured before running the scoring function.
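As a hedged illustration of what such configuration objects might look like, the following Python sketch embeds three small JSON documents and loads them. The field names follow the UPD and UMD descriptions in this document (a name and a parser for UPD entries; a transform, dependencies and siblings for UMD entries), while every value, such as parse_application or geocode_home, is hypothetical.

```python
import json

# UPD: which parser to apply to each named raw element.
UPD_JSON = """
[
  {"name": "application", "parser": "parse_application"},
  {"name": "credit_report", "parser": "parse_credit_report"}
]
"""

# UMD: how metavariables (siblings) are computed from their dependencies.
UMD_JSON = """
[
  {"transform": "geocode_home",
   "dependencies": ["home_address"],
   "siblings": ["home_lat", "home_lon"]},
  {"transform": "distance_from_home",
   "dependencies": ["home_lat", "home_lon", "app_lat", "app_lon"],
   "siblings": ["distance_km"]}
]
"""

# Black-list: variables obscured before the scoring function runs.
BLACKLIST_JSON = '["applicant_name", "home_address"]'

upd = json.loads(UPD_JSON)
umd = json.loads(UMD_JSON)
blacklist = set(json.loads(BLACKLIST_JSON))
```

Because the three objects are plain text, they can be reviewed and edited by team members who do not write the parser or transform code themselves.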
Data can be passed through the API layer of the Standard API 720 and can be collected with other external data requested from or sent by the external client 710 to the Standard API 720 and any persistent data stored in a local or otherwise coupled system database or databases and parsed in step 702 using a set of parsing functions according to a set of protocols described by a Universal Parsing Definition (UPD) file 702. A further example embodiment of step 702 is described below with respect to and shown in
The parsed data 730 can then be transformed into enriched data 740 with a set of high value signals according to a set of recipes, functions, or other instruction sets defined in a Universal Metadata Definition (UMD) file in step 703. These signals can exist in the form of variables or meta-variables in some embodiments. A further example embodiment of step 703 is described below with respect to and shown in
The enriched data 740 can then be filtered through one or more black lists in step 704, if applicable, yielding a filtered signal set of data 750 consisting of a named list of vectors. If no black lists are applicable then step 704 can be skipped in some embodiments. In many embodiments it can be essential that the list of vectors includes only vectors with compatible lengths: either a length of 1 or a constant length greater than one.
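The length rule matters because a named list of vectors can only be expanded into a rectangular table when every vector is either a single value, which is repeated on each row, or has the one shared row count. A minimal Python sketch, with the hypothetical function name assemble_table and with vectors represented as lists, is:

```python
from typing import Any, Dict, List


def assemble_table(signals: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Expand a named list of vectors into rows, recycling length-1 vectors."""
    lengths = {len(v) for v in signals.values()}
    long_lengths = lengths - {1}
    # Empty vectors, or two different lengths greater than one, are incompatible.
    if 0 in lengths or len(long_lengths) > 1:
        raise ValueError(f"incompatible vector lengths: {sorted(lengths)}")
    n_rows = long_lengths.pop() if long_lengths else 1
    return [{name: (vec[0] if len(vec) == 1 else vec[i])
             for name, vec in signals.items()}
            for i in range(n_rows)]
```

For example, a single ZIP code supplied alongside three account balances would yield three rows that all carry the same ZIP code.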
The list of vectors in the filtered signal set 750 can then be assembled into a table, for example a rectangular table, in step 705. A further example embodiment of steps 704 and 705 is described below with respect to and shown in
The row creation 760 can then be returned to the external client 710 for further processing in step 706. The external client 710 can then view, edit, manipulate or otherwise use the data as required. In the example embodiment, this can include sending the data through one or more scoring functions 770 (locally or remotely) and then receiving results in step 708.
UPD Objects
A universal parser definition (UPD) object as shown in the example embodiment starting at 820 and ending at 821 can be a list of parsing transformations that takes raw data fields (e.g. 812) and parses each into another list of lists of elements (e.g. 814) that can then be used as input to a set of metavariables. Each UPD object can include a list of two string fields: a name field 830 and a parser field 831. When a named element (e.g. 812) is passed into a function, such as a build.compound.row function, a value of the named element can be parsed using a function named by the parser field 831. A value of the parser function can be a named list with a length greater than or equal to a list containing the results of parsing the named object using the parser field 831.
In an example embodiment, if a named element 830 is an XML record or a JSON object 812 then a returned list might include individual values parsed out of the XML record or JSON object 812. In embodiments where the list is a list of JSON objects 812, then a returned list can be expected to contain a list of items parsed out of each individual JSON object 812 where, in turn, each sublist is the same length.
In an example embodiment of the client API, UPD objects can be defined in JSON (JavaScript Object Notation). Alternatively, UPD objects can be defined in an XML file with a known schema, as a comma-separated value (CSV) file, or in an opaque format such as might arise from serializing an object and storing it.
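The way a UPD entry pairs a named element with a parser can be sketched in Python as follows. The registry PARSERS, the functions parse_json_record and apply_upd, and the choice to prefix each parsed value with the element name are all assumptions of this sketch, not features of the build.compound.row function.

```python
import json
from typing import Any, Callable, Dict, List


def parse_json_record(raw: str) -> Dict[str, List[Any]]:
    """Hypothetical parser: split a JSON object into named one-element lists."""
    record = json.loads(raw)
    return {key: [value] for key, value in record.items()}


# Parsers are looked up by the string stored in each entry's parser field.
PARSERS: Dict[str, Callable[[str], Dict[str, List[Any]]]] = {
    "parse_json_record": parse_json_record,
}


def apply_upd(upd: List[Dict[str, str]],
              raw_fields: Dict[str, str]) -> Dict[str, List[Any]]:
    """Parse each named raw element with the parser its UPD entry names."""
    parsed: Dict[str, List[Any]] = {}
    for entry in upd:
        parser = PARSERS[entry["parser"]]
        result = parser(raw_fields[entry["name"]])
        for key, values in result.items():
            parsed[f'{entry["name"]}.{key}'] = values
    return parsed
```

The resulting named list of lists is the kind of structure from which metavariables could subsequently be computed.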
UMD Objects
A universal metadata definition (UMD) object can be a list of transformations that compute HVS (high value signal) metadata from other data variables and, in some embodiments, from other HVS metadata. Each of these UMD objects can contain three parts: 1) a string naming a transform 950 that creates one or more HVS metavariables; 2) a list of strings naming the “dependencies” 952 of the transform; and 3) a list of strings naming the “siblings” 954 computed by the transform. So, in the example embodiment shown in
As described above, a transform (e.g. 950) can be a function that uses two or more variables and returns at least one computed metavariable. The list of dependencies (e.g. 952) and the list of siblings (e.g. 954) can be related only by the fact that the list of dependencies is used to construct the list of siblings through the specified transformation; the two lists need not be the same length nor need the elements of the two lists match up in any way in some embodiments. A transform member can be the name of a function that performs the transform. The function with which a given transform name is associated can be looked up by name when a UMD file is loaded. This operation can be accomplished in languages which support reflection, such as Java, R, S-plus or Python. Additionally, this operation can be accomplished in languages which support dynamic loading by name, such as C or C++, or in any language with access to a DLL's symbol table in Windows environments.
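Looking up a transform by name can be sketched in Python using the built-in getattr function. The Transforms namespace, the debt_to_income transform and the helper names are hypothetical, and the convention that a transform returns a dictionary keyed by sibling name is an assumption of this sketch.

```python
from typing import Any, Callable, Dict


class Transforms:
    """Hypothetical namespace holding transform functions."""

    @staticmethod
    def debt_to_income(debt: float, income: float) -> Dict[str, float]:
        return {"dti": debt / income}


def resolve_transform(name: str) -> Callable[..., Dict[str, Any]]:
    """Find the function a UMD entry names, using reflection."""
    fn = None if name.startswith("_") else getattr(Transforms, name, None)
    if not callable(fn):
        raise LookupError(f"no transform named {name!r}")
    return fn


def run_entry(entry: Dict[str, Any], values: Dict[str, Any]) -> Dict[str, Any]:
    """Call the named transform on its dependencies and return its siblings."""
    fn = resolve_transform(entry["transform"])
    out = fn(*[values[d] for d in entry["dependencies"]])
    missing = set(entry["siblings"]) - set(out)
    if missing:
        raise ValueError(f"transform did not produce {sorted(missing)}")
    return {s: out[s] for s in entry["siblings"]}
```

Resolving names when the UMD file is loaded, rather than at scoring time, would surface a misspelled transform name before any request is processed.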
As described above, values of one metavariable can depend on the values of other metavariables. For instance, the distance between two addresses can typically be computed from the geodetic distance between their latitudes and longitudes, but the addresses themselves would not usually be presented in that form. As an example, a human may know that the Taj Mahal is a building in India but would likely be unaware that the Taj Mahal is located at 27.175015 North, 78.042155 East. Typically, the latitude and longitude (as initial metavariables) corresponding to a given address can be first computed using a geographical information system before geodetic distances between points can be computed from those initial metavariables.
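One way to schedule such dependent computations is to sort the transforms topologically, so that each transform runs only after the transforms producing its inputs. The following Python sketch uses the standard library graphlib module (Python 3.9 or later); the function name transform_order and the dictionary representation of UMD entries are assumptions of this sketch.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def transform_order(umd: List[dict]) -> List[str]:
    """Return transform names ordered so that dependencies are computed first."""
    # Map each computed variable to the transform that produces it.
    producer = {s: e["transform"] for e in umd for s in e["siblings"]}
    graph: Dict[str, Set[str]] = {}
    for e in umd:
        # Dependencies with no producer are raw inputs and impose no ordering.
        graph[e["transform"]] = {producer[d] for d in e["dependencies"]
                                 if d in producer}
    return list(TopologicalSorter(graph).static_order())
```

With entries for geocoding two addresses and for the distance between them, the distance transform would always appear after both geocoding transforms. A circular dependency causes graphlib to raise a CycleError instead of returning an order.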
In an example embodiment of the client API, UMD objects can be defined in JSON (JavaScript Object Notation). Alternatively, a UMD object can be defined in an XML file with a known schema, or in an opaque format such as might arise from serializing an object and storing it.
Blacklisted Variables
To elaborate, in the example embodiment, some or all generated HVS signals 1002 (e.g. in the form of enriched data 740 of
Some elements may not be eligible for use in implementing the system and methods described herein. For instance, in some embodiments, scoring functions may be required to depend only upon variables available at a particular scoring time or set of scoring times. In many embodiments there are variables that are available during a training time in which functions are being trained which are computable during the training but which are not available during the scoring time. In one example embodiment, loan performance of a previously given loan can be used to train a function. For reproducibility, any values associated with such variables from a previously given loan may need to be erased at the scoring time of a new loan processing. In addition, there can be variables which are available in implementation of a function but which may not be used for reasons such as privacy protection, legal reasons such as non-discrimination, or data source usage restriction. To guarantee that these variables are not available during production calls in the form of actual loan processing, each client layer can contain one or more blacklists of variables which can be removed, blocked or otherwise not used at production time.
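Removing blacklisted variables at production time can be sketched as a simple filter over the named signals. In the following Python fragment the function name apply_blacklists and the example variable names are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def apply_blacklists(signals: Dict[str, List[Any]],
                     blacklists: Iterable[Iterable[str]]) -> Dict[str, List[Any]]:
    """Drop every variable named on any of the supplied blacklists."""
    blocked = set().union(*blacklists)
    return {name: vec for name, vec in signals.items() if name not in blocked}
```

A client layer could, for example, keep one blacklist for training-only variables such as prior loan performance and another for variables excluded on privacy or non-discrimination grounds, and pass both to the filter.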
Data Flow in this Notional System
In an example embodiment, an external customer or other third party can call an outer layer of an API through a specified external endpoint, either through a web service as in a real time client accessing the system over a network or by instantiating one or more files in a jointly-accessible repository or other networked database. In the first step of a subsequent process, the reference implementation can consume or otherwise process that data and produce a set of one or more rows of assembled data to be scored by an associated scoring function. In a second step, this set of one or more rows of assembled data is passed on to a scoring function, which can return one or more scores for each row to the external customer or other third party, either directly as at least one file through a web service or by saving one or more result files in a jointly-accessible repository or other networked database.
Other Aspects of the Technology
Horizontal Scaling
Many domain specific computer programming languages can be slow. R, for instance, can be as much as two orders of magnitude slower than a general purpose language such as C or C++ when performing tasks such as string manipulations or list constructions. Two broad ways to work around or otherwise avoid these time delays are re-implementing portions of an API, scoring functions or both in a general purpose language, or scaling out the API, scoring function calls or both horizontally and then processing them simultaneously on multiple machines or modules with few interactions between them. In some embodiments, as a rule, a re-implementing process can be implemented for the API kernel code. Since the API kernel code is frequently called and is typically managed by a single team of developers in an organization, it can be worth optimizing. Because scoring functions themselves or outer layers of API clients are typically maintained by modelers or other data scientists, there may be little or no benefit to optimizing their code, and doing so can result in unnecessarily high costs. Functions in an outer layer of the API implementation are typically small and often vectorized. DSLs for mathematical or statistical operations are typically highly optimized for individual implementations within a system. As such, rewriting parser functions or metadata generator functions may provide a minimal speed-up during production. Since client layers should have a common structure across many clients in order to optimize sharing production and test code and since scoring functions typically need to be readable by relatively unsophisticated coders such as modelers or other data scientists, rewriting either of them can be wasted effort.
In some embodiments, horizontally scaling a system by replicating an API across many machines in parallel is a desirable approach to meeting throughput and latency requirements if they are not met when running the system on a single processor. In various embodiments, this form of horizontal scaling can be performed in many different ways. In some embodiments this can include sharing outer API layers and distributing the calls to the API from the outer API layers to scoring layers. In some embodiments this can include replicating entire stacks across many machines and load balancing at the machine level. In either case, the result of load balancing can be a significant increase in the throughput of data in the API and can result in reduction in data latency.
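The distribution of scoring calls across replicated scoring layers can be illustrated with a small Python sketch. Here threads stand in for separate machines, rows are dealt out round-robin, and the scores are reassembled in the original row order; the function name score_sharded and the round-robin policy are assumptions of this sketch rather than a description of any particular load balancer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def score_sharded(rows: List[dict],
                  score_fn: Callable[[List[dict]], List[float]],
                  n_workers: int) -> List[float]:
    """Split rows round-robin across workers and restore the original order."""
    shards = [rows[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker scores its own shard independently of the others.
        shard_scores = list(pool.map(score_fn, shards))
    scores = [0.0] * len(rows)
    for worker, part in enumerate(shard_scores):
        scores[worker::n_workers] = part
    return scores
```

Because each row is scored independently, the shards need no communication with one another, which is what allows this form of scaling to raise throughput.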
Blue Lists
Latency can increase when extra, lengthy or unnecessary computations of metavariables occur in various implementations or embodiments. Metavariables can be computed automatically for any scoring function which may eventually use them, which also allows the code that computes the metavariables to be shared across multiple scoring functions. However, some scoring functions may require only a small subset of metavariables, and by restricting the computations performed to those necessary for a particular scoring function, the latency of each call can be reduced.
Various embodiments can support “blue lists.” “Blue lists” can be lists of all variables actually used or otherwise processed by the scoring function during a calculation, regardless of whether they are externally or internally computed. Associated kernels may only require the scheduling of computations of variables that are actually used or otherwise processed by the scoring function based on one or more particular “blue lists,” thus reducing the number of metavariables actually computed. An example embodiment can use a particular set of UMD, UPD, and black list files along with the stored code that defines the parser and metadata construction code.
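The pruning that a blue list permits can be sketched as a backward walk from the variables a scoring function uses to the transforms needed to produce them. In the following Python fragment the function needed_transforms and the example entries in EXAMPLE_UMD are hypothetical.

```python
from typing import Iterable, List, Set

# Hypothetical UMD entries used only to illustrate the pruning.
EXAMPLE_UMD = [
    {"transform": "geocode_home", "dependencies": ["home_address"],
     "siblings": ["home_lat", "home_lon"]},
    {"transform": "geocode_app", "dependencies": ["ip_address"],
     "siblings": ["app_lat", "app_lon"]},
    {"transform": "distance",
     "dependencies": ["home_lat", "home_lon", "app_lat", "app_lon"],
     "siblings": ["distance_km"]},
    {"transform": "census_income", "dependencies": ["home_lat", "home_lon"],
     "siblings": ["tract_income"]},
]


def needed_transforms(umd: List[dict], blue_list: Iterable[str]) -> Set[str]:
    """Return only the transforms required to compute the blue-listed variables."""
    producer = {s: e for e in umd for s in e["siblings"]}
    seen: Set[str] = set()
    stack = list(blue_list)
    while stack:
        entry = producer.get(stack.pop())
        # Raw inputs have no producer; already-scheduled transforms are skipped.
        if entry is None or entry["transform"] in seen:
            continue
        seen.add(entry["transform"])
        stack.extend(entry["dependencies"])
    return seen
```

With a blue list containing only tract_income, the walk schedules census_income and geocode_home and skips geocode_app and distance entirely.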
As shown in the example embodiment in
In an example embodiment, adding one or more blue lists can reduce the latency of each scoring call from almost three seconds to approximately 100 msec.
Statelessness
There are a number of scoring function applications that may require stateful behavior, in which a final score of an array is computed by several back-and-forth exchanges between a client and one or more scoring functions. An example is multi-pass scoring. In multi-pass scoring the client can call a function such as build.compound.row, as described elsewhere herein, and initially score the result using a first scoring function. Depending on the initial score from the first scoring function, the client can call build.compound.row again and create a second score result using a second scoring function. To the extent that the two calls to build.compound.row share common parse results or common metavariables, unnecessary repetitions in processing can be avoided in the second build.compound.row call by storing the results of the first call. In an example embodiment, an applicant borrower may apply for a loan. In embodiments where the loan application is denied, an underwriting system may need to generate an “adverse action letter” for the denied applicant borrower which details the reasons why their application was denied. In order to reduce a cost of proxy computation, it can be helpful to delay processing of the reasons for inclusion in the adverse action letter to a time after the application itself is scored. This is due to the temporal cost of proxy computations involved in generating the reasons for inclusion in the adverse action letter. Delaying determination processing requires that the system be able to return the result of that determination to the portion of the system which initially requested that determination. In some embodiments of this system, this return is performed by invoking a ‘callback function’ which can be passed to the determination processing. A callback function can be a function which is expected to be called when an asynchronous processing step such as the determination processing step completes its task.
In some embodiments of this system, the content of the object passed to the callback function can be the actual result of the computation, such as the text of an adverse action letter. In other embodiments, the determination process can pass a reference to the desired output, such as a file name or a database key, to the callback function. In yet other embodiments, the determination system can invoke the callback function with only the data required to construct the final result, as when a determination system computes only codes for the reasons to be reported in the adverse action letter, leaving the actual construction to the callback function.
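The three kinds of callback payload described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any embodiment: the reason codes, the `REASON_TEXT` table, the decision rules, and all function names are hypothetical, and the determination runs synchronously here, whereas a deployed system would schedule it after scoring.

```python
import os
from typing import Callable, Dict, List

# Hypothetical reason codes and wording, for illustration only.
REASON_TEXT: Dict[str, str] = {
    "R01": "Insufficient credit history",
    "R02": "Debt-to-income ratio too high",
}


def determine_reason_codes(application: Dict[str, float]) -> List[str]:
    """Stand-in for the costly determination step (rules are assumptions)."""
    codes = []
    if application["months_of_history"] < 12:
        codes.append("R01")
    if application["debt_to_income"] > 0.4:
        codes.append("R02")
    return codes


def render_letter(codes: List[str]) -> str:
    lines = ["Your application was denied for the following reasons:"]
    lines += ["- " + REASON_TEXT[code] for code in codes]
    return "\n".join(lines)


def determine_full_letter(application: Dict[str, float],
                          callback: Callable[[str], None]) -> None:
    # Variant 1: the callback receives the actual result (the letter text).
    callback(render_letter(determine_reason_codes(application)))


def determine_to_file(application: Dict[str, float], out_dir: str,
                      callback: Callable[[str], None]) -> None:
    # Variant 2: the callback receives a reference (here, a file name).
    path = os.path.join(out_dir, "adverse_action_letter.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_letter(determine_reason_codes(application)))
    callback(path)


def determine_codes_only(application: Dict[str, float],
                         callback: Callable[[List[str]], None]) -> None:
    # Variant 3: the callback receives only the codes and builds the letter itself.
    callback(determine_reason_codes(application))
```

The variants differ only in how much work is left to the callback: none in the first, a lookup in the second, and the construction of the letter in the third.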
Data allowing the determination of the content of the callback function must be available to the determination process. In some embodiments, this data can constitute an encapsulation of the function itself, a mechanism by which the callback can be recovered from memory, or a pointer to a persistent resource within which the result of the determination system can be stored. In other embodiments, there may be no explicit callback function, but rather a second subsystem which is only invoked when the determination system stores its results in a known location which was provided to the determination system when it was invoked.
These data items, such as the nature of the content function or a key or location within which the results are to be stored, are referred to as persistent state. Systems which retain such information are referred to as stateful; they are said to exhibit statefulness.
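A minimal sketch of two of the mechanisms described above follows: recovering a callback from memory by key, and storing results in a known location that a second subsystem later reads. The class `CallbackRegistry`, the function names, the placeholder result string, and the use of an in-memory dictionary as the "known location" are all assumptions made for illustration.

```python
import uuid
from typing import Callable, Dict, Optional


class CallbackRegistry:
    """Persistent state: maps keys to callbacks so they can be recovered later."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callable[[str], None]] = {}

    def register(self, callback: Callable[[str], None]) -> str:
        key = uuid.uuid4().hex
        self._callbacks[key] = callback
        return key

    def recover(self, key: str) -> Callable[[str], None]:
        # Remove the entry so each callback is invoked at most once.
        return self._callbacks.pop(key)


def run_determination(application_id: str, callback_key: str,
                      registry: CallbackRegistry) -> None:
    """Determination step that is handed only a key, not the function itself."""
    result = f"reasons for {application_id}"  # placeholder result
    registry.recover(callback_key)(result)


def run_determination_to_store(application_id: str, location: str,
                               store: Dict[str, str]) -> None:
    """No explicit callback: write the result to the location supplied at invocation."""
    store[location] = f"reasons for {application_id}"


def collect_if_ready(location: str, store: Dict[str, str]) -> Optional[str]:
    """Second subsystem: acts only once a result exists at the known location."""
    return store.get(location)
```

In both cases the key or location handed to the determination step is the persistent state; a system that keeps it between the scoring request and the later determination is stateful in the sense used here.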
Once selectable variables 322 have been chosen by a user, the user can select a generate report button 324. This will cause the program to apply a particular scoring function as described herein and to generate a report. In the example embodiment this can include outputting a report for the user determining the applicant's reliability or likelihood of paying back a loan and whether the applicant's request should be granted.
Any of the above-described processes and methods may be implemented by any now or hereafter known computing device. For example, the methods may be implemented in such a device via computer-readable instructions embodied in a computer-readable medium such as a computer memory, computer storage device or carrier signal. Similarly, storage, storing, or other necessary functions such as processing can be implemented by operative devices including processors and non-transitory computer-readable media.
The preceding described embodiments of the technology are provided as illustrations and descriptions. They are not intended to limit the invention to the precise form described. In particular, it is contemplated that functional implementations of the technology described herein may be implemented equivalently in hardware, software, firmware, and/or other available functional components or building blocks, and that networks may be wired, wireless, or a combination of wired and wireless. Other variations and embodiments are possible in light of the above teachings, and it is thus intended that the scope of the invention not be limited by this Detailed Description, but rather by the Claims following.
This application is a continuation of, and claims priority under 35 U.S.C. § 120 to, U.S. application Ser. No. 17/223,698, filed Apr. 6, 2021, which is a continuation of U.S. patent application Ser. No. 16/157,960, now U.S. Pat. No. 11,010,339, filed Oct. 11, 2018, which is a continuation of U.S. application Ser. No. 14/886,926, now U.S. Pat. No. 10,127,240, filed Oct. 19, 2015, which claims priority to U.S. Provisional Application No. 62/065,445, filed Oct. 17, 2014, the entire contents of which are fully incorporated herein by reference. This application also relates to U.S. application Ser. No. 13/454,970, entitled “SYSTEM AND METHOD FOR PROVIDING CREDIT TO UNDERSERVED BORROWERS” filed Apr. 24, 2012; U.S. application Ser. No. 14/169,400 entitled “METHODS AND SYSTEMS FOR AUTOMATICALLY GENERATING HIGH QUALITY ADVERSE ACTION NOTIFICATIONS” filed Jan. 31, 2014; U.S. application Ser. No. 14/276,632 entitled “SYSTEM AND METHOD FOR BUILDING AND VALIDATING A CREDIT SCORING FUNCTION” filed May 13, 2014; U.S. application Ser. No. 13/622,260 entitled “SYSTEM AND METHOD FOR BUILDING AND VALIDATING A CREDIT SCORING FUNCTION” filed Sep. 18, 2012 and U.S. Provisional Application 62/187,748 entitled “METHODS AND SYSTEMS FOR DATA COERCION” filed Jul. 1, 2015 which applications are also hereby incorporated in their entirety by reference.
Number | Date | Country | |
---|---|---|---|
20230334017 A1 | Oct 2023 | US |
Number | Date | Country | |
---|---|---|---|
62065445 | Oct 2014 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 17223698 | Apr 2021 | US |
Child | 18337164 | US | |
Parent | 16157960 | Oct 2018 | US |
Child | 17223698 | US | |
Parent | 14886926 | Oct 2015 | US |
Child | 16157960 | US |