1. Field of the Invention
The present invention relates to loan valuation and more specifically to a method and apparatus for computing a loan quality score for property loans. A loan quality score may be used by a lender in determining whether or not to issue or purchase a loan on a particular property.
2. Background of the Invention
There exists a need in the loan industry for objective criteria to determine the likelihood that a loan may not be repaid due to fraudulent misrepresentation of the collateral. Determining this accurately is even more difficult in a rapidly growing or fluctuating property market. Many times the appraisal supporting the loan application for a particular property is inaccurate, exaggerated or an outright attempt at loan fraud. As a result, a lender on a particular property, whether for a home purchase loan or for a mortgage on a home, would like to have a reliable indicator of the likelihood that loan fraud is about to occur. A method is needed whereby a lender may evaluate the accuracy and validity of a particular loan request and have ready access to the information upon which that evaluation is based for each target property.
It is therefore an object of the present invention to provide a means by which the quality of a loan and the valuation for the property being given may be tested for validity and accuracy. It is another object of the present invention to use numerous variables to provide as accurate a loan quality score as possible for use by a lender for a loan on a residential or other property.
A method and apparatus are provided for computing a loan quality score using numerous metrics that have been found to relate to the likelihood of property overvaluation or loan fraud. The present invention collects relevant data, either from automated valuation models, publicly available records or other sources, performs calculations based upon that data and then provides a comprehensive loan quality score. In the preferred embodiment, details of the data used to create the loan quality score are also provided.
a is a table depicting the values of the variables and calculations used in an example loan quality score generation.
b is a table depicting the calculation of the Loan Quality Score using the Logit from
a is a table depicting the values of the variables and calculations used in another example loan quality score generation of the preferred embodiment.
b is a table depicting the calculation of the Loan Quality Score using the Logit from
The present invention provides a method and apparatus for computing a loan quality score for a loan on a residential or other property. Because the loan industry is one in which numerous loan applications must be quickly approved or denied based upon limited knowledge of the subject property being lent upon, a method is needed by which the sufficiency and validity of the collateral for the loan may be evaluated. This invention addresses that need by calculating a loan quality score, based upon numerous criteria. The loan quality score is calculated in different ways if particular information is missing for a subject property. In the preferred embodiment, the data upon which the quality score is based is also provided.
Referring first to
The computation processor 12 is responsible for performing the calculations associated with applying the algorithms used to calculate the loan quality score to the data. The temporary memory 36 is used to store the variables as used in the equation and other temporary data prior to use or output. The report generator 14 is used to format the data into a report as described below. The output connector 16 is used to connect the loan quality scoring data structure to outside output methods. This could include connections to the Internet 32, typically using traditional means such as output to a dynamically generated webpage. There may also be alternative output 34 such as output of the report or loan quality score to a fax machine or other output device.
The input connector 18 receives input 24 from a keyboard, a mouse, the Internet or any number of other input devices. The database connector 20 connects the loan quality scoring data structure to various databases 26. The automated valuation model connector 22 connects the loan quality scoring data structure to any number of automated valuation models (commonly referred to as AVMs), such as automated valuation models X in element 28 and Y in element 30. These are used to gather value estimations for the target properties for which the loan quality score is being generated.
Referring next to
In the preferred embodiment, the next step in the loan quality scoring process is to estimate the value using a particular automated valuation model. This step is shown in element 40 of
Next, in one embodiment, the loan score computation method searches the user input of the seller name(s) for certain key words known to correlate with loan fraud. This is also known as a “string search.” This step is depicted in element 42 in
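The string search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the invention: the function name `risky_seller_flag` and the placeholder keyword list are hypothetical, because the description does not disclose which key words are used. The sketch also assumes that each key word is a single word and that matching is case-insensitive and whole-word.

```python
import re

# Hypothetical placeholders: the description does not list the actual key words.
RISKY_KEYWORDS = ("placeholder1", "placeholder2")


def risky_seller_flag(seller_names, keywords=RISKY_KEYWORDS):
    """Return 1 if any seller name contains a key word as a whole word, else 0."""
    wanted = {keyword.upper() for keyword in keywords}
    for name in seller_names:
        # Split on non-alphanumeric characters so punctuation cannot hide a match.
        tokens = set(re.split(r"[^A-Za-z0-9]+", name.upper()))
        if wanted & tokens:
            return 1
    return 0
```

A flag produced this way could serve as one input to a binary seller-risk variable, although the description does not state exactly how the search result is used.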
In the preferred embodiment, the next step is to apply the loan quality score algorithm as depicted in element 44. The algorithm utilizes several variables. They are as follows:
The algorithm in this embodiment also considers the ratio of user-submitted value, US, to the AVM valuation, AVM. An algorithm is applied using these variables. This algorithm is as follows:
Where:
Logit is the natural logarithm of the odds, namely p/(1−p), where p is the probability that the loan is fraudulent.
RS is the risky seller binary dummy variable. If the seller is risky, then the binary variable is set to 1. If the seller is not risky, then the binary variable is set to 0.
TS is the number of times the property has been sold in the past three years.
RF is a binary dummy variable for refinance loans. If the loan is a refinance, the binary variable is set to 1, otherwise it is set to 0.
AO is a binary dummy variable for absentee owner. If the purchaser does not intend to live in the subject property after purchase, this binary variable is set to 1, otherwise it is set to 0.
AVM is the automated valuation model's estimate of value.
EX is the binary dummy variable when user-submitted value exceeds automated valuation model valuation. If the user-submitted value exceeds the automated valuation, this binary variable is set to 1, otherwise it is set to 0.
EX50 is the binary dummy variable when user-submitted value exceeds automated valuation model valuation by 50% or more. If the user-submitted value exceeds the automated valuation by 50% or more, this binary variable is set to 1, otherwise it is set to 0.
NARM is the binary dummy variable for a non-arm's length transfer. If the sale appears not to be at arm's length, that is, it appears to be between family members or individuals of the same name, then this binary variable is set to 1, otherwise it is set to 0.
AG is the age of the target property.
LA is the loan amount.
AV is the appraised value.
US is the user-submitted value.
SF is the square footage of the target property.
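As an illustration of how the value-comparison variables defined above might be derived from the user-submitted value and the AVM valuation, consider the following Python sketch. The function name `value_comparison_flags` is hypothetical, and reading "exceeds by 50% or more" as US >= 1.5 × AVM is an assumption about the intended threshold.

```python
def value_comparison_flags(user_value, avm_value):
    """Derive the US/AVM ratio and the EX and EX50 dummy variables."""
    if avm_value <= 0:
        raise ValueError("AVM value must be positive")
    ratio = user_value / avm_value
    # EX: user-submitted value exceeds the AVM valuation.
    ex = 1 if user_value > avm_value else 0
    # EX50 (assumed reading): user-submitted value is at least 150% of the AVM valuation.
    ex50 = 1 if user_value >= 1.5 * avm_value else 0
    return {"ratio": ratio, "EX": ex, "EX50": ex50}
```

With the example values used later in this description (US of $61,000 and AVM of $56,000), the sketch yields EX = 1 and EX50 = 0.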
Each of these variables is derived either directly from the user input or by examining data in a database, collected over time, which includes known fraudulent loan requests. Also, some variables are included after calculating their relevance based upon the user input data or database data. The entire equation has been derived using techniques designed to take each selected variable into account, and the coefficients associated with the variables have been found to provide the most accurate representation of their relevance in predicting potential loan fraud.
The equation used in this embodiment and the equation used in the preferred embodiment are derived using a sample set of fraudulent and non-fraudulent loan data. Statistical analysis is used to derive the above equation, and it has been found to be the best mode. However, alternative equations may exist and may be used. In alternative embodiments of this invention, one or more of the required variables listed above may not be available or the user may not input them. In these cases, a different equation is used, one derived using statistical analysis without the variable or variables that are unavailable. In another alternative embodiment, additional variables or fewer variables will be included. Additional statistical analysis will be required to derive an equation for each group of data used to predict fraudulent loan applications.
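One way to picture the choice among alternative equations is a lookup keyed on the variables each equation requires. The Python sketch below is purely illustrative: the function `select_equation`, the shape of the `equations` mapping, and the rule of preferring the usable equation with the most variables are all assumptions, and no coefficients are implied.

```python
def select_equation(available, equations):
    """Pick an equation whose required variables are all available.

    `available` is a set of variable names that have known values.
    `equations` maps a frozenset of required variable names to a callable.
    Assumed rule: among usable equations, prefer the one using the most variables.
    """
    available = set(available)
    usable = [required for required in equations if required <= available]
    if not usable:
        raise LookupError("no equation can be evaluated with the available variables")
    return equations[max(usable, key=len)]
```

Each callable in the mapping would stand for an equation derived separately by statistical analysis for that particular set of variables.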
Once the Logit is computed, the loan quality score is computed, as depicted in element 46, by multiplying the Logit, as computed above, by a predetermined constant and then subtracting that result from another constant. In this embodiment, these two constants are determined by comparing scores produced using the present invention with scores produced for loans known to be fraudulent and using statistical analysis to derive the correct constants. In this embodiment, the following equation is used to compute the loan quality score:
Loan Quality Score=500−(33*Logit)
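The score equation above translates directly into code. The following Python sketch uses a hypothetical function name, `loan_quality_score`, and exposes the two constants as parameters on the assumption that other embodiments may use different values.

```python
def loan_quality_score(logit, base=500.0, scale=33.0):
    """Loan Quality Score = base - (scale * Logit)."""
    return base - scale * logit


# Example Logit of 3.744 from this description.
print(round(loan_quality_score(3.744)))  # 376
```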
Referring now to
RS, the risky seller binary variable is 0—the buyer and seller are not risky as depicted in element 52.
TS, the number of times the property has been sold in the past three years is 2 as depicted in element 54.
RF, the binary variable for a refinance loan is 0—it is not a refinance loan as depicted in element 56.
AO, the binary variable for absentee owner is 1—the borrower does not intend to occupy the property as depicted in element 58.
AVM, the automated valuation model's estimate of value is $56,000 as depicted in element 60.
EX, the binary variable when user-submitted value exceeds automated valuation model valuation is 1—the user-submitted value exceeds the automated valuation model value as depicted in element 62.
EX50, the binary variable when user-submitted value exceeds automated valuation model valuation by 50% or more is 0; the user-submitted value does not exceed the automated valuation model valuation by 50% or more as depicted in element 64.
NARM, the binary variable for a non-arm's length transfer is 0—the transaction appears to be arm's length between the buyer and seller as depicted in element 66.
AG, the age of the target property is 77 years as depicted in element 68.
LA, the loan amount is $48,800 as depicted in element 70.
US, the user-submitted value is $61,000 as depicted in element 72.
SF, the square footage of the target property is 2072 as depicted in element 74.
The equation would then be:
The sum of each of these is:
Logit=3.744 (in element 102)
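Because the Logit is defined as the natural logarithm of p/(1−p), it can be inverted to recover the implied probability p. The Python sketch below shows that inversion; the function name `fraud_probability` is hypothetical, and the description itself reports only the Logit and the resulting score, not p.

```python
import math


def fraud_probability(logit):
    """Invert Logit = ln(p / (1 - p)) to obtain p = 1 / (1 + e^(-Logit))."""
    return 1.0 / (1.0 + math.exp(-logit))
```

For a Logit of 3.744 this gives p of roughly 0.98; a Logit of 0 corresponds to p = 0.5.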
Referring now to
This results in a loan quality score of 376.
In another embodiment, a different algorithm is applied in the step depicted in element 44 of
The variables used in this embodiment are as
The algorithm in this embodiment also considers the ratio of user-submitted appreciation to the median appreciation in a predetermined geographic area during the same period. In this embodiment, the predetermined geographic area is a census tract. This ratio is known as the appreciation variance ratio or AVR. The following algorithm, used in this embodiment, has been found to be the best mode, given the data available currently. This algorithm is applied using the above-listed variables. The algorithm in this embodiment is as follows:
Where:
Logit is the natural logarithm of the odds, namely p/(1−p), where p is the probability that the loan is fraudulent.
PL is the percent of households earning less than a specified amount. In this embodiment, this amount is $25,000 per year.
TS is the number of times the property has been sold in the past three years.
RF is a binary dummy variable for refinance loans. If the loan is a refinance, the binary variable is set to 1, otherwise it is set to 0.
AVM is the automated valuation model's estimate of value.
EX is the binary dummy variable when user-submitted value exceeds automated valuation model valuation. If the user-submitted value exceeds the automated valuation, this binary variable is set to 1, otherwise it is set to 0.
AG is the age of the target property.
LA is the loan amount.
AVR is the ratio of the appreciation in value, as given by the user, to the appreciation in value of the median home price in a predetermined geographic area. In this embodiment, a census tract is used; however, alternative embodiments may use other predetermined geographic areas. Theoretically, this ratio should be one to one. The larger the disparity between the suggested appreciation in value of the subject property and the appreciation in value of the median home price, the more likely it is that fraud is occurring. By using the census tract, the set of homes against which the subject property is judged is kept narrow, which makes the comparison more accurate. This variable has been shown to have a high correlation to fraud in that the user's suggested property value appreciation is one of the main ways in which loan fraud is carried out. This variable provides an accurate measure of that appreciation when considered in light of the median appreciation in the narrow range of properties surrounding the subject property.
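The appreciation variance ratio can be illustrated with a short Python sketch. The description does not give a formula for appreciation, so the sketch assumes it is the fractional change in value over the period; the function names and parameters are likewise hypothetical.

```python
def appreciation(prior_value, current_value):
    """Fractional change in value over a period (assumed definition)."""
    if prior_value <= 0:
        raise ValueError("prior value must be positive")
    return (current_value - prior_value) / prior_value


def appreciation_variance_ratio(subject_prior, subject_current,
                                median_prior, median_current):
    """AVR: subject property appreciation divided by median-area appreciation."""
    median_appreciation = appreciation(median_prior, median_current)
    if median_appreciation == 0:
        raise ZeroDivisionError("median appreciation is zero; AVR is undefined")
    return appreciation(subject_prior, subject_current) / median_appreciation
```

Under this definition, a subject property that appreciated at the same rate as the median home in its census tract has an AVR of 1, and larger values indicate appreciation claims that outpace the surrounding area.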
Once the Logit is computed, as above, the loan quality score is computed, as depicted in element 46, by multiplying the Logit, as computed above, by a predetermined constant and then subtracting that result from another constant. In this embodiment, these two constants are determined by comparing scores produced using the present invention with scores produced for loans known to be fraudulent and using statistical analysis to derive the correct constants. In the preferred embodiment, the following equation is used to compute the loan quality score:
Loan Quality Score=500−(31*Logit)
Referring now to
PL, the percent of households earning less than a specified amount (in the preferred embodiment, $25,000 per year), is 20% as depicted in element 108.
TS, the number of times the property has been sold in the past three years is 2 as depicted in element 110.
RF, the binary variable for a refinance loan is 0—it is not a refinance loan as depicted in element 112.
AVM, the automated valuation model's estimate of value is $56,000 as depicted in element 114.
EX, the binary variable when user-submitted value exceeds automated valuation model valuation is 1; the user-submitted value exceeds the automated valuation model value as depicted in element 116.
AG, the age of the target property is 77 years as depicted in element 118.
LA, the loan amount is $48,800 as depicted in element 120.
AVR, the appreciation variance ratio is 1.2 as depicted in element 122.
The sum of each of these is:
Logit=0.68164 (in element 142)
Referring now to
This results in a loan quality score of approximately 479.
The next step in the preferred embodiment is to provide this score to the user as depicted in element 48. Alternative scores may be computed, particularly if the user is missing portions of the data required by either equation. If some data is missing, alternative equations will be used, dependent upon which portions of data are missing. These alternative embodiments are not ideal, but will be used as necessary. Using one of the above equations or an alternative equation, a score between 0 and 1000 is computed. The above equations can produce scores below 0 or above 1000, so lower and upper bounds are applied: any score below the lower bound or above the upper bound is automatically set to that bound. This score is provided to the user.
A low score on this scale indicates a questionable loan. A low, or unsatisfactory, score is a score from zero to 500. A marginal score is a score from 500 to 550; in this range the loan is questionable, but not unsatisfactory. Finally, a score above 550 is a satisfactory score. Receiving a particular score is not a predictor of fraud, but a statistically based indication of an increased likelihood of real estate loan fraud.
Therefore, using the results from above, a loan quality score of 376, as computed in the first embodiment, is within the unsatisfactory range. A loan quality score of 479, as computed in the second embodiment, is also within the unsatisfactory range. Therefore, the likelihood of fraud is high for both of these loan applications.
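The bounding and banding of scores described above can be sketched in Python as follows. The function names are hypothetical, and because the description gives overlapping endpoints (zero to 500 and 500 to 550), the sketch assumes that a score of exactly 500 falls in the marginal band.

```python
def clamp_score(score, lower=0.0, upper=1000.0):
    """Set scores outside the bounds to the nearest bound."""
    return max(lower, min(upper, score))


def score_band(score):
    """Classify a loan quality score into the bands given in the description."""
    bounded = clamp_score(score)
    if bounded < 500:
        return "unsatisfactory"
    if bounded <= 550:  # assumption: exactly 500 is treated as marginal
        return "marginal"
    return "satisfactory"
```

Applied to the two example scores in this description, 376 and 479, the sketch returns "unsatisfactory" in both cases.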
In the final step in the practice of this invention the following are provided: (1) a report including the score, (2) each of the user-inputted variables and their values, (3) other indicators of potential fraud and (4) neighboring sales data. These are provided in a report format as depicted in element 50. In the preferred embodiment, the user input is received via the Internet and the report is provided over the Internet. In some alternative embodiments, this step may not be completed, and the score alone may be provided. Alternatively, only portions of the report or portions of the data used to derive the report may be provided.
Accordingly, a method and apparatus for computing a loan quality score has been described. It is to be understood that the foregoing description has been made with respect to specific embodiments thereof for illustrative purposes only. The overall spirit and scope of the present invention is limited only by the following claims, as defined in the foregoing description.